Drosophila genomic sequence annotation using the BLOCKS plus database

Citation
J.G. Henikoff and S. Henikoff, Drosophila genomic sequence annotation using the BLOCKS plus database, GENOME RES, 10(4), 2000, pp. 543-546
Number of citations
20
Subject Categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
Journal ISSN
1088-9051
Volume
10
Issue
4
Year of publication
2000
Pages
543 - 546
Database
ISI
SICI code
1088-9051(200004)10:4<543:DGSAUT>2.0.ZU;2-8
Abstract
A simple and general homology-based method for gene finding was applied to the 2.9-Mb Drosophila melanogaster Adh region, the target sequence of the Genome Annotation Assessment Project (GASP). Each strand of the entire sequence was used as a query against the BLOCKS database of conserved regions of proteins. This led to functional assignments for more than one-third of the genes and two-thirds of the transposons. Considering the enormous size of the query, the fact that only two false-positive matches were reported emphasizes the high selectivity of protein family-based methods for gene finding. We used the search results to improve BLOCKS+ by identifying compositionally biased blocks. Our results confirm that protein family databases can be used effectively in automated sequence annotation efforts.
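
Illustrative note (not part of the original record): searching both strands of a DNA sequence against a protein database implies conceptually translating each strand in three reading frames. The Python sketch below shows that preparation step only. It is not the authors' software; the function names are hypothetical, the standard genetic code is assumed, and the actual block search and scoring are not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_queries(dna):
    """Hypothetical helper: map (strand, frame offset) to a translated query."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for key, protein in six_frame_queries("ATGGCCATTGTAATGGGCCGC").items():
        print(key, protein)
```

Each of the six translated strings could then be compared against protein family blocks by whatever search tool is in use; for a megabase-scale query, the sequence would normally be processed in overlapping windows rather than as one string.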