COMPARATIVE-ANALYSIS OF 1196 ORTHOLOGOUS MOUSE AND HUMAN FULL-LENGTH MESSENGER-RNA AND PROTEIN SEQUENCES

Authors

MAKALOWSKI W ZHANG JH BOGUSKI MS

Citation

W. Makalowski et al., COMPARATIVE-ANALYSIS OF 1196 ORTHOLOGOUS MOUSE AND HUMAN FULL-LENGTH MESSENGER-RNA AND PROTEIN SEQUENCES, PCR methods and applications, 6(9), 1996, pp. 846-857

Citations number

Subject Categories

Biotechnology & Applied Microbiology, Biology

Journal title

PCR methods and applications

ISSN journal

10549803

Volume

6

Issue

9

Year of publication

1996

Pages

846 - 857

Database

ISI

SICI code

1054-9803(1996)6:9<846:CO1OMA>2.0.ZU;2-4

Abstract

A large set of mRNA and encoded protein sequences from orthologous murine and human genes was compiled to analyze statistical, biological, and evolutionary properties of coding and noncoding transcribed sequences. Protein sequence conservation varied between 36% and 100% identity, with an average value of 85%. The average degree of nucleotide sequence identity for the corresponding coding sequences was also approximately 85%, whereas 5' and 3' untranslated regions (UTRs) were less conserved, with aligned identities of 67% and 69%, respectively. For some mouse and human genes, nucleotide sequences are more highly conserved than the encoded protein sequences. A subset of 32 sequences, consisting of only mouse/human protein pairs for which the human sequence represents a positionally cloned disease gene, had properties very similar to the larger data set, suggesting that our data are representative of the genome as a whole. With respect to sequence conservation, two interesting outliers are the breast cancer (BRCA1) gene product and the testis-determining factor (SRY), both of which display among the lowest degrees of sequence identity. The occurrence of both introns and repetitive elements (e.g., Alu, B1) in 5' and 3' UTRs was also studied. These results provide one benchmark for the "comparative genomics" of mice and humans, with practical implications for the cross-referencing of transcript maps. Also, they should prove useful in estimating the additional sampling diversity provided by mouse EST sequencing projects designed to complement the existing human cDNA collection.