Toward high-throughput genotyping: Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers
Jl. Li et al., Toward high-throughput genotyping: Dynamic and automatic software for manipulating large-scale genotype data using fluorescently labeled dinucleotide markers, GENOME RES, 11(7), 2001, pp. 1304-1314
To efficiently manipulate large amounts of genotype data generated with fluorescently labeled dinucleotide markers, we developed a Microsoft Access database management system, named GenoDB. GenoDB offers several advantages. First, it accommodates the dynamic nature of genotype data accumulation during the genotyping process, in which some data need to be confirmed or replaced by repeat lab procedures. With GenoDB, raw genotype data can be imported easily and continuously and incorporated into the database throughout the genotyping process, which may continue over an extended period of time in large projects. Second, almost all of the procedures are automatic, including autocomparison of the raw data read by different technicians from the same gel, autoadjustment of allele fragment-size data across runs or platforms, autobinning of alleles, and autocompilation of genotype data for suitable programs to perform inheritance checks in pedigrees. Third, GenoDB provides functions to track electrophoresis gel files to locate the gel or sample source for any resultant genotype data, which is extremely helpful for double-checking the consistency of raw and final data and for directing repeat experiments. In addition, the user-friendly graphical interface of GenoDB renders processing of large amounts of data much less labor-intensive. Furthermore, GenoDB has built-in mechanisms to detect some genotyping errors and to assess the quality of genotype data, which are then summarized in statistical reports generated automatically by GenoDB. GenoDB can easily handle >500,000 genotype data entries, a number more than sufficient for typical whole-genome linkage studies. The modules and programs we developed for GenoDB can be extended to other database platforms, such as Microsoft SQL Server, if the capability to handle still greater quantities of genotype data simultaneously is desired.
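
The abstract does not say how GenoDB implements autocomparison or autobinning, so the following is only a minimal Python sketch of what such steps might look like, not the authors' method. It assumes a simple gap-based binning rule (a new bin starts when consecutive sorted fragment sizes differ by more than `max_gap` base pairs) and a fixed tolerance for comparing two technicians' reads. The function names, the thresholds, and the sample data are all hypothetical.

```python
from statistics import mean


def bin_alleles(sizes, max_gap=1.0):
    """Group fragment sizes (bp) into bins.

    Assumed rule: after sorting, a gap larger than max_gap starts a new bin.
    """
    bins = []
    for size in sorted(sizes):
        if bins and size - bins[-1][-1] <= max_gap:
            bins[-1].append(size)
        else:
            bins.append([size])
    return bins


def call_allele(size, bins):
    """Label a fragment with the rounded mean of the nearest bin."""
    nearest = min(bins, key=lambda b: abs(mean(b) - size))
    return round(mean(nearest))


def compare_reads(read_a, read_b, tolerance=0.5):
    """Return sample IDs where two independent reads disagree.

    A sample is flagged if it is missing from one read or if the two
    sizes differ by more than tolerance (bp).
    """
    flagged = []
    for sample in sorted(set(read_a) | set(read_b)):
        a, b = read_a.get(sample), read_b.get(sample)
        if a is None or b is None or abs(a - b) > tolerance:
            flagged.append(sample)
    return flagged


# Hypothetical fragment sizes for one dinucleotide marker.
sizes = [150.1, 149.9, 152.2, 151.8, 156.0, 150.3]
bins = bin_alleles(sizes)
labels = [round(mean(b)) for b in bins]

# Hypothetical reads of the same gel by two technicians.
tech_a = {"s1": 150.1, "s2": 152.0, "s3": 156.0}
tech_b = {"s1": 150.2, "s2": 154.1}
discordant = compare_reads(tech_a, tech_b)
```

A gap-based rule like this can merge adjacent alleles when sizing noise approaches the 2 bp spacing of dinucleotide repeats, so a production system would likely need marker-specific thresholds or the cross-run size adjustment mentioned in the abstract before binning.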