J. Burke et al., ALTERNATIVE GENE FORM DISCOVERY AND CANDIDATE GENE SELECTION FROM GENE INDEXING PROJECTS, PCR methods and applications, 8(3), 1998, pp. 276-290
Several efforts are under way to partition single-read expressed sequence tag (EST), as well as full-length transcript data, into large-scale gene indices, where transcripts are in common index classes if and only if they share a common progenitor gene. Accurate gene indexing facilitates gene expression studies, as well as inexpensive and early gene sequence discovery through assembly of ESTs that are derived from genes that have not been sequenced by classical methods. We extend, correct, and enhance the information obtained from index groups by splitting index classes into subclasses based on sequence dissimilarity (diversity). Two applications of this are highlighted in this report. First, it is shown that our method can ameliorate the damage that artifacts, such as chimerism, inflict on index integrity. Additionally, we demonstrate how the organization imposed by an effective subpartition can greatly increase the sensitivity of gene expression studies by accounting for the existence and tissue- or pathology-specific regulation of novel gene isoforms and polymorphisms. We apply our subpartitioning treatment to the UniGene gene indexing project to measure a marked increase in information quality and abundance (in terms of assembly length and insertion/deletion error) after treatment and demonstrate cases where new levels of information concerning differential expression of alternate gene forms, such as regulated alternative splicing, are discovered.
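
The idea of splitting an index class into subclasses by sequence dissimilarity can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the dissimilarity measure or the grouping rule, so the function names `dissimilarity` and `subpartition`, the `max_dissimilarity` threshold, the single-linkage grouping, and the use of Python's `difflib` in place of a real alignment score are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def dissimilarity(a: str, b: str) -> float:
    """Toy stand-in for an alignment-based score (assumption):
    1 minus difflib's similarity ratio, so 0.0 means identical."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def subpartition(index_class: Dict[str, str],
                 max_dissimilarity: float) -> List[List[str]]:
    """Split one index class (sequence id -> sequence) into subclasses.

    Two sequences are linked when their dissimilarity is at most
    max_dissimilarity; subclasses are the connected components of
    those links (single linkage, a hypothetical choice).
    """
    ids = list(index_class)
    parent = {seq_id: seq_id for seq_id in ids}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if dissimilarity(index_class[a], index_class[b]) <= max_dissimilarity:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for seq_id in ids:
        groups.setdefault(find(seq_id), []).append(seq_id)
    return sorted(sorted(group) for group in groups.values())


# Made-up toy data: two near-identical pairs placed in one index class.
toy_class = {
    "est1": "ACGTACGTACGTACGTACGT",
    "est2": "ACGTACGTACGAACGTACGT",
    "est3": "TTTTTTTTTTGGGGGGGGGG",
    "est4": "TTTTTTTTTTGGGGGGGGGC",
}
subclasses = subpartition(toy_class, max_dissimilarity=0.3)
print(subclasses)
```

On this toy input the two divergent sequence families end up in separate subclasses. A real treatment would score dissimilarity only over aligned overlapping regions of the reads, and would need a grouping rule that does not let a chimeric read bridge two unrelated subclasses, which plain single linkage does not guarantee.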