Mouse BAC ends quality assessment and sequence analyses

Citation
Sy. Zhao et al., Mouse BAC ends quality assessment and sequence analyses, GENOME RES, 11(10), 2001, pp. 1736-1745
Number of citations
38
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
ISSN journal
1088-9051
Volume
11
Issue
10
Year of publication
2001
Pages
1736 - 1745
Database
ISI
SICI code
1088-9051(200110)11:10<1736:MBEQAA>2.0.ZU;2-N
Abstract
A large-scale BAC end-sequencing project at The Institute for Genomic Research (TIGR) has generated one of the most extensive sets of sequence markers for the mouse genome to date. With a sequencing success rate of >80%, an average read length of 485 bp, and ABI3700 capillary sequencers, we have generated 449,234 nonredundant mouse BAC end sequences (mBESs) with 218 Mb total from 257,318 clones from libraries RPCI-23 and RPCI-24, representing 15x clone coverage, 7% sequence coverage, and a marker every 7 kb across the genome. A total of 191,916 BACs have sequences from both ends providing 12x genome coverage. The average Q20 length is 406 bp and 84% of the bases have phred quality scores ≥20. RPCI-24 mBESs have more Q20 bases and longer reads on average than RPCI-23 sequences. ABI3700 sequencers and the sample tracking system ensure that >95% of mBESs are associated with the right clone identifiers. We have found that a significant fraction of mBESs contains L1 repeats and ~48% of the clones have both ends with ≥100 bp contiguous unique Q20 bases. About 3% mBESs match ESTs and >70% of matches were conserved between the mouse and the human or the rat. Approximately 0.1% mBESs contain STSs. About 0.2% mBESs match human finished sequences and >70% of these sequences have EST hits. The analyses indicate that our high-quality mouse BAC end sequences will be a valuable resource to the community.
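The headline figures in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' method: the counts are taken from the abstract, while the genome size of about 3,000 Mb (`ASSUMED_GENOME_BP`) and the function name `summarize` are assumptions introduced here for illustration.

```python
# Figures quoted in the abstract.
N_BES = 449_234          # nonredundant mouse BAC end sequences
TOTAL_BP = 218_000_000   # "218 Mb" (rounded in the source)
N_CLONES = 257_318       # clones with at least one end sequence
N_PAIRED = 191_916       # clones with sequences from both ends

# ASSUMPTION (not stated in the abstract): approximate mouse genome size.
ASSUMED_GENOME_BP = 3_000_000_000


def summarize(genome_bp=ASSUMED_GENOME_BP):
    """Recompute summary statistics from the quoted counts."""
    single_end = N_CLONES - N_PAIRED  # clones with only one end sequenced
    return {
        # Total bases divided by number of end sequences.
        "mean_read_bp": TOTAL_BP / N_BES,
        # Two ends per paired clone plus one per single-end clone.
        "ends_from_clones": 2 * N_PAIRED + single_end,
        # Fraction of the assumed genome covered by end-sequence bases.
        "sequence_coverage_pct": 100 * TOTAL_BP / genome_bp,
        # Average distance between end-sequence markers.
        "marker_spacing_kb": genome_bp / N_BES / 1000,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

Under the assumed genome size, these values are consistent with the 485 bp average read length, the roughly 7% sequence coverage, and the marker every 7 kb reported in the abstract. The paired and single-end clone counts also sum to the reported total of end sequences.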