S. Karlin, STATISTICAL STUDIES OF BIOMOLECULAR SEQUENCES - SCORE-BASED METHODS, Philosophical transactions-Royal Society of London. Biological sciences, 344(1310), 1994, pp. 391-402
The massive accumulation of DNA and protein sequence data poses challenges and opportunities in terms of interpretation and analysis. This presentation reviews the method of score-based sequence analysis with the objectives of discerning distinctive segments in single sequences and identifying significant common segments in sequence comparisons. A number of new results are described here for both the theory and its applications. These include distributional theory involving several high scoring segments in single sequences, distribution formulas for general scoring regimes in multiple sequence comparisons, bounds for periodic scoring assignments, sensitivity analysis of genome composition and refinements on predicting exons and genes in DNA sequences.