A SYSTEM FOR PATTERN-MATCHING APPLICATIONS ON BIOSEQUENCES

Authors
G. Mehldau; G. Myers
Citation
G. Mehldau and G. Myers, A SYSTEM FOR PATTERN-MATCHING APPLICATIONS ON BIOSEQUENCES, Computer applications in the biosciences, 9(3), 1993, pp. 299-314
Number of citations
17
Subject Categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Applications & Cybernetics; Biology Miscellaneous
Journal ISSN
0266-7061
Volume
9
Issue
3
Year of publication
1993
Pages
299 - 314
Database
ISI
SICI code
0266-7061(1993)9:3<299:ASFPAO>2.0.ZU;2-0
Abstract
ANREP is a system for finding matches to patterns composed of (i) spacing constraints called 'spacers', and (ii) approximate matches to 'motifs' that are, recursively, patterns composed of 'atomic' symbols. A user specifies such patterns via a declarative, free-format and strongly typed language called A that is presented here in a tutorial style through a series of progressively more complex examples. The sample patterns are for protein and DNA sequences, the application domain for which ANREP was specifically created. ANREP provides a unified framework for almost all previously proposed biosequence patterns and extends them by providing approximate matching, a feature heretofore unavailable except for the limited case of individual sequences. The performance of ANREP is discussed and an appendix gives a concise specification of syntax and semantics. A portable C software package implementing ANREP is available via anonymous remote file transfer.
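
Illustrative sketch (not part of the original record). The abstract describes patterns built from approximately matched motifs separated by spacers. The Python fragment below is a minimal sketch of that general idea only: it is neither the A language nor ANREP's algorithm. It assumes substitution-only (Hamming) mismatches and a single spacer between two motifs, and the names hamming_hits and match_two_motifs, the example sequence, and all parameters are hypothetical.

```python
from typing import List, Tuple


def hamming_hits(seq: str, motif: str, max_mismatches: int) -> List[int]:
    """Return start positions where motif matches seq with at most
    max_mismatches substitutions (assumption: no insertions or deletions)."""
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


def match_two_motifs(
    seq: str,
    motif_a: str,
    motif_b: str,
    spacer_min: int,
    spacer_max: int,
    max_mismatches: int,
) -> List[Tuple[int, int]]:
    """Return (start_a, start_b) pairs where motif_a is followed by motif_b
    and the gap between them lies in [spacer_min, spacer_max]."""
    hits_b = set(hamming_hits(seq, motif_b, max_mismatches))
    results = []
    for start_a in hamming_hits(seq, motif_a, max_mismatches):
        end_a = start_a + len(motif_a)
        for gap in range(spacer_min, spacer_max + 1):
            if end_a + gap in hits_b:
                results.append((start_a, end_a + gap))
    return results


# Hypothetical DNA example: two hexamers separated by a 3-5 base spacer.
print(match_two_motifs("AATTGACACCCCTATAATGG", "TTGACA", "TATAAT", 3, 5, 1))
# [(2, 12)]
```

The sketch scans every window by brute force, which keeps the logic visible; it says nothing about how ANREP itself achieves the performance discussed in the paper.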