Qp. Yuan et al., Rice bioinformatics. Analysis of rice sequence data and leveraging the data to other plant species, PLANT PHYSL, 125(3), 2001, pp. 1166-1174
Rice (Oryza sativa) is a model species for monocotyledonous plants,
especially for members of the grass family. Several attributes such as small
genome size, diploid nature, transformability, and establishment of genetic
and molecular resources make it a tractable organism for plant biologists.
With an estimated genome size of 430 Mb (Arumuganathan and Earle, 1991), it
is feasible to obtain the complete genome sequence of rice using current
technologies. An international effort has been established and is in the
process of sequencing O. sativa ssp. japonica var. "Nipponbare" using a
bacterial artificial chromosome/P1 artificial chromosome shotgun sequencing
strategy. Annotation of the rice genome is performed using prediction-based
and homology-based searches to identify genes. Annotation tools such as
optimized gene prediction programs are being developed for rice to improve
the quality of annotation. Resources are also being developed to leverage
the rice genome sequence to partial genome projects such as expressed
sequence tag projects, thereby maximising the output from the rice genome
project. To provide a low level of annotation for rice genomic sequences, we
have aligned all rice bacterial artificial chromosome/P1 artificial
chromosome sequences with The Institute for Genomic Research Gene Indices, a
set of nonredundant transcripts generated from nine public plant expressed
sequence tag projects (rice, wheat, sorghum, maize, barley, Arabidopsis,
tomato, potato, and barrel medic). In addition, we have used data from The
Institute for Genomic Research Gene Indices and the Arabidopsis and Rice
Genome Projects to identify putative orthologues and paralogues among these
nine genomes.
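
The abstract does not state how putative orthologues were called. As a rough illustration only, the sketch below uses the common reciprocal-best-hit heuristic, which may differ from the authors' procedure. The function names, sequence identifiers, and scores are all hypothetical, and the input is assumed to be (query, subject, score) tuples taken from any pairwise alignment search.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score) tuples (assumed format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs in which each is the other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
rice_vs_maize = [
    ("rice_TC1", "maize_TC7", 410.0),
    ("rice_TC1", "maize_TC9", 120.0),
    ("rice_TC2", "maize_TC9", 300.0),
]
maize_vs_rice = [
    ("maize_TC7", "rice_TC1", 405.0),
    ("maize_TC9", "rice_TC2", 290.0),
    ("maize_TC9", "rice_TC1", 118.0),
]

pairs = reciprocal_best_hits(rice_vs_maize, maize_vs_rice)
print(pairs)
```

Pairs that pass this test are only candidate orthologues. Strong hits between sequences of the same species would instead be candidate paralogues, and either call would normally need further evidence before being accepted.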