H. Salamon et al., ACCOMMODATING ERROR ANALYSIS IN COMPARISON AND CLUSTERING OF MOLECULAR FINGERPRINTS, EMERGING INFECTIOUS DISEASES, 4(2), 1998, pp. 159-168
Molecular epidemiologic studies of infectious diseases rely on
pathogen genotype comparisons, which usually yield patterns comprising
sets of DNA fragments (DNA fingerprints). We use a highly developed
genotyping system, IS6110-based restriction fragment length
polymorphism analysis of Mycobacterium tuberculosis, to develop a
computational method that automates comparison of large numbers of
fingerprints. Because error in fragment length measurements is
proportional to fragment length and is positively correlated for
fragments within a lane, an align-and-count method that compensates
for relative scaling of lanes reliably counts matching fragments
between lanes. Results of a two-step method we developed to cluster
identical fingerprints agree closely with 5 years of computer-assisted
visual matching among 1,335 M. tuberculosis fingerprints. Fully
documented and validated methods of automated comparison and
clustering will greatly expand the scope of molecular epidemiology.
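
The align-and-count idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the relative tolerance `rel_tol`, the scaling bound `max_scale`, and the greedy matching rule are all assumptions chosen for illustration. The sketch rescales one lane by candidate factors, applies a match tolerance that grows in proportion to fragment length, and reports the largest number of matching fragments found.

```python
from typing import Sequence


def count_matches(a: Sequence[float], b: Sequence[float], rel_tol: float) -> int:
    """Greedily count one-to-one matches between two sorted length lists.

    Two fragments match when they differ by at most rel_tol times their
    mean length, so the tolerance is proportional to fragment length.
    """
    i = j = matches = 0
    while i < len(a) and j < len(b):
        x, y = a[i], b[j]
        if abs(x - y) <= rel_tol * (x + y) / 2:
            matches += 1
            i += 1
            j += 1
        elif x < y:
            i += 1
        else:
            j += 1
    return matches


def align_and_count(
    lane_a: Sequence[float],
    lane_b: Sequence[float],
    rel_tol: float = 0.01,    # assumed per-fragment relative tolerance
    max_scale: float = 0.05,  # assumed bound on lane-to-lane scaling
) -> int:
    """Return the best match count over candidate rescalings of lane_b.

    Fragment lengths are assumed to be positive numbers (e.g. base pairs).
    """
    a = sorted(lane_a)
    b = sorted(lane_b)
    # Candidate scale factors: no rescaling, plus every ratio a_i / b_j
    # that stays within the allowed scaling range.
    candidates = {1.0}
    for x in a:
        for y in b:
            s = x / y
            if abs(s - 1.0) <= max_scale:
                candidates.add(s)
    best = 0
    for s in candidates:
        scaled = [y * s for y in b]
        best = max(best, count_matches(a, scaled, rel_tol))
    return best


if __name__ == "__main__":
    lane_a = [1000.0, 2000.0, 3000.0, 4000.0]
    # Same fingerprint, but every fragment in the lane reads 3% long.
    lane_b = [x * 1.03 for x in lane_a]
    print(count_matches(lane_a, lane_b, 0.01))  # comparison without alignment
    print(align_and_count(lane_a, lane_b))      # comparison after alignment
```

In this example every fragment in the second lane is shifted by the same relative amount, which mimics error that is correlated within a lane. A fixed per-fragment tolerance alone misses the matches, whereas allowing a single lane-wide scale factor recovers them.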