The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

Authors

Camargo, AA; Samaia, HPB; Dias-Neto, E; Simao, DF; Migotto, IA; Briones, MRS; Costa, FF; Nagai, MA; Verjovski-Almeida, S; Zago, MA; Andrade, LEC; Carrer, H; El-Dorry, HFA; Espreafico, EM; Habr-Gama, A; Giannella-Neto, D; Goldman, GH; Gruber, A; Hackel, C; Kimura, ET; Maciel, RMB; Marie, SKN; Martins, EAL; Nobrega, MP; Paco-Larson, ML; Pardini, MIMC; Pereira, GG; Pesquero, JB; Rodrigues, V; Rogatto, SR; da Silva, IDCG; Sogayar, MC; Sonati, MDF; Tajara, EH; Valentini, SR; Alberto, FL; Amaral, MEJ; Aneas, I; Arnaldi, LAT; de Assis, AM; Bengtson, MH; Bergamo, NA; Bombonato, V; de Camargo, MER; Canevari, RA; Carraro, DM; Cerutti, JM; Correa, MLC; Correa, RFR; Costa, MCR; Curcio, C; Hokama, POM; Ferreira, AJS; Furuzawa, GK; Gushiken, T; Ho, PL; Kimura, E; Krieger, JE; Leite, LCC; Majumder, P; Marins, M; Marques, ER; Melo, ASA; Melo, M; Mestriner, CA; Miracca, EC; Miranda, DC; Nascimento, ALTO; Nobrega, FG; Ojopi, EPB; Pandolfi, JRC; Pessoa, LG; Prevedel, AC; Rahal, P; Rainho, CA; Reis, EMR; Ribeiro, ML; da Ros, N; de Sa, RG; Sales, MM; Sant'anna, SC; dos Santos, ML; da Silva, AM; da Silva, NP; Silva, WA; da Silveira, RA; Sousa, JF; Stecconi, D; Tsukumo, F; Valente, V; Soares, F; Moreira, ES; Nunes, DN; Correa, RG; Zalcberg, H; Carvalho, AF; Reis, LFL; Brentani, RR; Simpson, AJG; de Souza, SJ

Citation

A.A. Camargo et al., The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome, Proc. Natl. Acad. Sci. USA, 98(21), 2001, pp. 12103-12108

Citations number

Subject categories

Multidisciplinary

Journal title

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

ISSN journal

0027-8424

Volume

98

Issue

21

Year of publication

2001

Pages

12103 - 12108

Database

ISI

SICI code

0027-8424(20011009)98:21<12103:TCO7OS>2.0.ZU;2-Q

Abstract

Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning.