Preliminary profile of the Cryptosporidium parvum genome: an expressed sequence tag and genome survey sequence analysis

Citation
W.B. Strong and R.G. Nelson, Preliminary profile of the Cryptosporidium parvum genome: an expressed sequence tag and genome survey sequence analysis, Molecular and Biochemical Parasitology, 107(1), 2000, pp. 1-32
Number of citations
103
Subject categories
Microbiology
Journal title
MOLECULAR AND BIOCHEMICAL PARASITOLOGY
Journal ISSN
0166-6851
Volume
107
Issue
1
Year of publication
2000
Pages
1 - 32
Database
ISI
SICI code
0166-6851(20000315)107:1<1:PPOTCP>2.0.ZU;2-C
Abstract
Cryptosporidium parvum is a protozoan enteropathogen that infects humans and animals and causes a pronounced diarrheal disease that can be life-threatening in immunocompromised hosts. No specific chemo- or immunotherapies exist to treat cryptosporidiosis, and little molecular information is available to guide development of such therapies. To accelerate gene discovery and identify genes encoding potential drug and vaccine targets, we constructed sporozoite cDNA and genomic DNA sequencing libraries from the Iowa isolate of C. parvum and determined approximately 2000 sequence tags by single-pass sequencing of random clones. Together, the 567 expressed sequence tags (ESTs) and 1507 genome survey sequences (GSSs) totaled one megabase (1 Mb) of unique genomic sequence, indicating that approximately 10% of the 10.4 Mb C. parvum genome has been sequence tagged in this gene discovery expedition. The tags were used to search the public nucleic acid and protein databases via BLAST analyses, and 180 ESTs (32%) and 277 GSSs (18%) exhibited similarity with database sequences at smallest sum probabilities P(N) ≤ 10^-8. Some tags encoded proteins with clear therapeutic potential, including S-adenosylhomocysteine hydrolase, histone deacetylase, polyketide/fatty-acid synthases, various cyclophilins, thrombospondin-related cysteine-rich protein and ATP-binding-cassette transporters. Several anonymous ESTs encoded proteins predicted to contain signal peptides or multiple transmembrane spanning segments, suggesting they were destined for membrane-bound compartments, the cell surface or extracellular secretion. One hundred four simple sequence repeats were identified within the nonredundant sequence tag collection, with (TAA)≥6/(TTA)≥6 and (TA)≥10/(AT)≥10 being the most prevalent, occurring 40 and 15 times, respectively. Various cellular RNAs and their genes were also identified, including the small and large ribosomal RNAs, five tRNAs, the U2 small nuclear RNA, and the small and large virus-like, double-stranded RNAs. This investigation has demonstrated that survey sequencing is an efficient procedure for gene discovery and genome characterization and has identified and sequence tagged many C. parvum genes encoding potential therapeutic targets. (C) 2000 Elsevier Science B.V. All rights reserved.
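
The percentages quoted in the abstract can be checked from the counts it reports. The following is only a minimal arithmetic sketch: the variable names and the `percent` helper are illustrative assumptions, not part of the original record, and rounding to the nearest integer is assumed to match how the abstract states its figures.

```python
# Counts and sizes as reported in the abstract.
est_total, est_hits = 567, 180      # ESTs, and ESTs with database similarity
gss_total, gss_hits = 1507, 277     # GSSs, and GSSs with database similarity
unique_mb, genome_mb = 1.0, 10.4    # unique tagged sequence and genome size (Mb)


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


total_tags = est_total + gss_total            # the "approximately 2000" tags
est_pct = percent(est_hits, est_total)        # share of ESTs with a hit
gss_pct = percent(gss_hits, gss_total)        # share of GSSs with a hit
coverage_pct = percent(unique_mb, genome_mb)  # share of genome tagged

print(total_tags, est_pct, gss_pct, coverage_pct)
```

Under these assumptions the computed values agree with the abstract's rounded figures.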