During the determination of DNA sequences, frameshift errors are not
the most frequent, but they are the most bothersome as they corrupt the
amino acid sequence over several residues. Detection of such errors by
sequence alignment is only possible when related sequences are found
in the databases. To avoid this limitation, we have developed a new
tool based on the distribution of non-overlapping 3-tuples or 6-tuples
in the three frames of an ORF. The method relies upon the result of a
correspondence analysis. It has been extensively tested on Bacillus
subtilis and Saccharomyces cerevisiae sequences and has also been
examined with human sequences. The results indicate that it can detect
frameshift errors affecting as few as 20 bp with a low rate of false
positives (no more than 1.0/1000 bp scanned). The proposed algorithm can
be used to scan a large collection of data, but it is mainly intended for
laboratory practice as a tool for checking the quality of the sequences
produced during a sequencing project.
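
As an illustration of the input data only, the tuple distribution described above could be tabulated roughly as follows. This is a minimal sketch, not the published tool: the function name tuple_counts_by_frame, the example sequence, and the treatment of 6-tuples (read with a step of 6 from each of the three frame offsets) are assumptions made here, and the correspondence analysis itself is not implemented.

```python
from collections import Counter


def tuple_counts_by_frame(seq, k=3):
    """Count non-overlapping k-tuples in each of the three frames of seq.

    Hypothetical helper for illustration: frame f starts at offset f
    (0, 1 or 2) and is read in consecutive, non-overlapping windows of
    length k. Incomplete windows at the end are ignored.
    """
    seq = seq.upper()
    counts = []
    for frame in range(3):
        windows = (seq[i:i + k] for i in range(frame, len(seq) - k + 1, k))
        counts.append(Counter(windows))
    return counts


# Made-up toy ORF, not taken from any of the organisms named in the text.
orf = "ATGGCTGCTAAATAA"
frames = tuple_counts_by_frame(orf)        # 3-tuples
hexamers = tuple_counts_by_frame(orf, k=6)  # 6-tuples

for offset, counter in enumerate(frames):
    print(offset, dict(counter))
```

Each frame yields one count vector. A table of such vectors over many ORFs is the kind of contingency table a correspondence analysis takes as input, but how the original method builds and scores that table is not specified here.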