Distance-based pattern matching of DNA sequences for evaluating primary mutation

2017 2nd International conferences on Information Technology, Information Systems and Electrical Engineering (ICITISEE) ◽

10.1109/icitisee.2017.8285518 ◽

2017 ◽

Author(s):

Berlian Al Kindhi ◽

Muhammad Afif Hendrawan ◽

Diana Purwitasari ◽

Tri Arief Sardjono ◽

Mauridhi Hery Purnomo

Keyword(s):

Pattern Matching ◽

DNA Sequences ◽

Primary Mutation

EPMA: Efficient pattern matching algorithm for DNA sequences

Expert Systems with Applications ◽

10.1016/j.eswa.2017.03.026 ◽

2017 ◽

Vol 80 ◽

pp. 162-170 ◽

Cited By ~ 6

Author(s):

Muhammad Tahir ◽

Muhammad Sardaraz ◽

Ataul Aziz Ikram

Keyword(s):

Pattern Matching ◽

DNA Sequences ◽

Matching Algorithm ◽

Pattern Matching Algorithm

On the Probability of Pattern Matching in Nonaligned DNA Sequences: A Finite Markov Chain Imbedding Approach

Scan Statistics and Applications ◽

10.1007/978-1-4612-1578-3_13 ◽

1999 ◽

pp. 287-302 ◽

Cited By ~ 4

Author(s):

James C. Fu ◽

W. Y. Wendy Lou ◽

S. C. Chen

Keyword(s):

Markov Chain ◽

Pattern Matching ◽

DNA Sequences ◽

Finite Markov Chain ◽

Finite Markov Chain Imbedding ◽

Markov Chain Imbedding

Weak factor automata: the failure of failure factor oracles?

South African Computer Journal ◽

10.18489/sacj.v53i0.199 ◽

2014 ◽

Vol 53 ◽

Author(s):

Loek Cleophas ◽

Derrick G. Kourie ◽

Bruce W. Watson

Keyword(s):

Pattern Matching ◽

DNA Sequences ◽

Finite Automata ◽

Compact Representation ◽

Data Sets ◽

Matching Algorithm ◽

Weak Factor ◽

Ex Post ◽

Ex Post Facto ◽

Matching Performance

In indexing of, and pattern matching on, DNA and text sequences, it is often important to represent all factors of a sequence. One efficient, compact representation is the factor oracle (FO). At the same time, any classical deterministic finite automaton (DFA) can be transformed into a so-called failure DFA (FDFA), which may use failure transitions to replace multiple symbol transitions, potentially yielding a more compact representation. We combine the two ideas and directly construct a failure factor oracle (FFO) from a given sequence, in contrast to ex post facto transformation to an FDFA. The algorithm is suitable for both short and long sequences. We empirically compared the resulting FFOs and FOs by number of transitions for many DNA sequences of lengths 4 to 512, showing gains of up to 10% in total number of transitions, with failure transitions also taking up less space than symbol transitions. The resulting FFOs can be used for indexing, as well as in a variant of the FO-using backward oracle matching algorithm. We discuss and classify this pattern matching algorithm in terms of the keyword pattern matching taxonomies of Watson, Cleophas and Zwaan. We also empirically compared the use of FOs and FFOs in such backward reading pattern matching algorithms, using both DNA and natural language (English) data sets. The results indicate that the decrease in pattern matching performance of an algorithm using an FFO instead of an FO may outweigh the gain in representation space by using an FFO instead of an FO.
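
For orientation, the plain factor oracle that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard online FO construction, not the authors' FFO algorithm: it builds symbol transitions only and does not replace any of them with failure transitions. The function names `build_factor_oracle` and `accepts`, and the example sequence, are assumptions made for this sketch.

```python
def build_factor_oracle(sequence):
    """Build a plain factor oracle (FO) for `sequence`.

    Returns (transitions, supply): `transitions[state]` maps a symbol to
    the next state, and `supply[state]` is the supply link used during
    construction. States are numbered 0..len(sequence).
    """
    n = len(sequence)
    transitions = [dict() for _ in range(n + 1)]
    supply = [-1] * (n + 1)  # the initial state has no supply link

    for i, symbol in enumerate(sequence):
        # Internal transition along the sequence itself.
        transitions[i][symbol] = i + 1
        # Follow supply links, adding external transitions where missing.
        k = supply[i]
        while k > -1 and symbol not in transitions[k]:
            transitions[k][symbol] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else transitions[k][symbol]
    return transitions, supply


def accepts(transitions, word):
    """Return True if `word` labels a path from state 0 (all states accept)."""
    state = 0
    for symbol in word:
        if symbol not in transitions[state]:
            return False
        state = transitions[state][symbol]
    return True


if __name__ == "__main__":
    dna = "ACGTACGA"  # hypothetical example sequence
    transitions, _ = build_factor_oracle(dna)
    print(accepts(transitions, "GTAC"))  # a factor of the sequence
    print(sum(len(t) for t in transitions), "transitions")
```

A factor oracle accepts every factor of the sequence but may also accept some strings that are not factors, so a hit from `accepts` generally needs verification against the text. Counting `sum(len(t) for t in transitions)` gives the kind of transition total that the abstract compares between FOs and FFOs.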

Efficient Pattern Matching Algorithms for DNA Sequences

2020 25th International Computer Conference, Computer Society of Iran (CSICC) ◽

10.1109/csicc49403.2020.9050070 ◽

2020 ◽

Author(s):

Peyman Neamatollahi ◽

Montassir Hadi ◽

Mahmoud Naghibzadeh

Keyword(s):

Pattern Matching ◽

DNA Sequences

Compressed Pattern Matching in DNA Sequences Using Multithreaded Technology

2009 3rd International Conference on Bioinformatics and Biomedical Engineering ◽

10.1109/icbbe.2009.5162550 ◽

2009 ◽

Author(s):

Piyuan Lin ◽

Shaopeng Liu ◽

Lixia Zhang ◽

Peijie Huang

Keyword(s):

Pattern Matching ◽

DNA Sequences

Pattern-matching search of DNA sequences using logic grammars

[1991] Proceedings. The Seventh IEEE Conference on Artificial Intelligence Application ◽

10.1109/caia.1991.120837 ◽

1991 ◽

Author(s):

D.B. Searls ◽

M.O. Noordewier

Keyword(s):

Pattern Matching ◽

DNA Sequences ◽

Logic Grammars

Fast bitwise pattern-matching algorithm for DNA sequences on modern hardware

Turkish Journal of Electrical Engineering & Computer Sciences ◽

10.3906/elk-1304-165 ◽

2015 ◽

Vol 23 ◽

pp. 1405-1417 ◽

Cited By ~ 3

Author(s):

Gıyasettin ÖZCAN ◽

Osman Sabri ÜNSAL

Keyword(s):

Pattern Matching ◽

DNA Sequences ◽

Matching Algorithm ◽

Pattern Matching Algorithm

A Pattern Matching Approach for the Estimation of Alignment Between Any Two Given DNA Sequences

Journal of Medical Systems ◽

10.1007/s10916-007-9062-3 ◽

2007 ◽

Vol 31 (4) ◽

pp. 247-253 ◽

Cited By ~ 2

Author(s):

K. Basu ◽

N. Sriraam ◽

R. J. A. Richard

Keyword(s):

Pattern Matching ◽

DNA Sequences

Pattern Matching for DNA Sequencing Data Using Multiple Bloom Filters

BioMed Research International ◽

10.1155/2019/7074387 ◽

2019 ◽

Vol 2019 ◽

pp. 1-9 ◽

Cited By ~ 3

Author(s):

Maleeha Najam ◽

Raihan Ur Rasool ◽

Hafiz Farooq Ahmad ◽

Usman Ashraf ◽

Asad Waqar Malik

Keyword(s):

DNA Sequencing ◽

DNA Sequence ◽

Pattern Matching ◽

DNA Sequences ◽

Sequence Data ◽

Bloom Filters ◽

Sequencing Data ◽

DNA Sequence Data ◽

Efficient Data ◽

Improved Accuracy

Storing and processing large DNA sequences has always been a major problem because of the increasing volume of DNA sequence data. A number of solutions have been proposed, but they require significant computation and memory. Therefore, an efficient storage and pattern matching solution is required for DNA sequencing data. Bloom filters (BFs) are an efficient data structure, used in bioinformatics mostly for classification of DNA sequences. In this paper, we explore uses of BFs beyond classification. The proposed solution is based on Multiple Bloom Filters (MBFs) and finds all the locations and the number of repetitions of a specified pattern inside a DNA sequence. Both of these factors are extremely important in determining the type and intensity of any disease. This paper serves as a first effort towards optimizing the search for the location and frequency of substrings in DNA sequences using MBFs. We expect that further optimizations of the proposed solution can bring remarkable results, as this paper presents a proof-of-concept implementation for a given set of data using the proposed MBF technique. Performance evaluation shows improved accuracy and time efficiency of the proposed approach.
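
The abstract does not spell out how the multiple filters are organized, so the following is only a sketch of one plausible arrangement, not the authors' MBF technique: the sequence is split into blocks, each block gets its own Bloom filter over its k-mers, and only blocks whose filter reports a possible hit are scanned to recover exact positions and a repetition count. The class `BloomFilter`, the functions `build_block_filters` and `find_pattern`, and all sizing parameters are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter; stores one flag per byte for simplicity."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _indexes(self, item):
        # Derive several hash positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for index in self._indexes(item):
            self.bits[index] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[index] for index in self._indexes(item))


def build_block_filters(sequence, k, block_size):
    """One filter per block, holding every k-mer that starts in the block."""
    filters = []
    for start in range(0, len(sequence), block_size):
        bloom = BloomFilter()
        # Extend by k - 1 so k-mers crossing the block boundary are kept.
        block = sequence[start:start + block_size + k - 1]
        for i in range(len(block) - k + 1):
            bloom.add(block[i:i + k])
        filters.append((start, bloom))
    return filters


def find_pattern(sequence, pattern, filters, block_size):
    """Return all start positions of `pattern` (its length must equal k)."""
    k = len(pattern)
    positions = []
    for start, bloom in filters:
        if not bloom.might_contain(pattern):
            continue  # the whole block is skipped without scanning
        end = min(start + block_size, len(sequence) - k + 1)
        for i in range(start, end):
            # Verify each candidate to discard Bloom false positives.
            if sequence[i:i + k] == pattern:
                positions.append(i)
    return positions


if __name__ == "__main__":
    dna = "ACGTACGTTACGA"  # hypothetical example sequence
    block_filters = build_block_filters(dna, k=3, block_size=4)
    hits = find_pattern(dna, "ACG", block_filters, block_size=4)
    print(hits, len(hits))  # positions and number of repetitions
```

Because a Bloom filter never yields false negatives, skipping a block is always safe; the verification scan removes any false positives, so the reported positions and count are exact. The trade-off in this sketch is that the pattern length is fixed to `k` when the filters are built.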

Compressed pattern matching in DNA sequences

Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004. ◽

10.1109/csb.2004.1332418 ◽

2004 ◽

Cited By ~ 4

Author(s):

Lei Chen ◽

Shiyong Lu ◽

J. Ram

Keyword(s):

Pattern Matching ◽

DNA Sequences
