"Frontmatter". In: Plant Genomics and Proteomics
Christopher A. Cullis - Plant Genomics and Proteomics-J. Wiley & Sons (2004)
GENOME ANNOTATION

Among the first features to be placed on newly acquired, assembled genomic sequences are the possible open reading frames and splice sites. These two sets of data are combined to identify both already known and putative genes. An example of such an annotation is shown in Figure 4.4 (Ware et al., 2002). All the available information has been added to the sequence, including the BAC end sequences, markers for the rice maps including the SSR markers, and positions of predicted genes, submitted genes, and other EST data from rice, maize, Hordeum, Triticum, and Sorghum.

In all gene predictions from genomic DNA, the precise identification of the gene boundaries and exon-intron structure is hindered by the lack of supporting experimental evidence. Full-length cDNA sequences and bioinformatics software can produce insights on the structure of genes in chromosomal DNA. Therefore, full-length cDNA sequences are essential for confirmation of the predicted genes within a sequenced genome. Having a full-length cDNA enables the checking of both the extent of the coding region of the gene and the sequences immediately 5′ and 3′ of the coding sequence. In addition, having a full-length cDNA makes it possible to train the gene-finding programs so that the unknown regions of the genome can be more accurately annotated as far as the presence of genes is concerned. The availability of many full-length cDNAs and trained gene-finding programs from a small number of model plants will also ease the identification of genes in partial genomic sequences of more exotic plant species.

SYNTENY

As the full genomes of Arabidopsis and rice are more precisely annotated, the finding and isolation of potential genes in other, less well-defined systems may be possible with reference to the position of the sequence in a particular cluster of genes.
However, these predictions are likely to be complicated by the presence of multiple copies of genes, the divergence between paralogs and orthologs (see Chapter 1) in other species, and the micro- and macro-rearrangements of the chromosomes over evolutionary time. Therefore, any candidates will need to be extensively characterized to demonstrate that they are performing the same function in both time and space.
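
The positional reasoning behind synteny-based gene finding can be illustrated with a minimal Python sketch. It is not a method taken from the text: the function names, the gene identifiers, and the map positions are all hypothetical, and the sketch assumes that gene order is conserved between the annotated reference genome and the less well-defined target, which is exactly the assumption that rearrangements and gene duplication can break. Given the gene order in the reference and the target positions of those reference genes that have a recognised counterpart, it returns the target interval in which a candidate for the gene of interest would be sought.

```python
def flanking_anchors(reference_order, anchors, gene):
    """Return the nearest anchored reference genes on each side of `gene`.

    reference_order: gene IDs in chromosomal order in the annotated genome.
    anchors: dict mapping a reference gene ID to its position in the target
             genome (only genes with a recognised counterpart are present).
    """
    idx = reference_order.index(gene)
    # Walk outwards from the gene until an anchored neighbour is found.
    left = next((g for g in reversed(reference_order[:idx]) if g in anchors), None)
    right = next((g for g in reference_order[idx + 1:] if g in anchors), None)
    return left, right


def candidate_interval(reference_order, anchors, gene):
    """Return the (start, end) target interval bracketed by the flanking
    anchors, or None if the gene is not flanked on both sides."""
    left, right = flanking_anchors(reference_order, anchors, gene)
    if left is None or right is None:
        return None
    # Sorting tolerates a simple inversion of the segment in the target.
    start, end = sorted((anchors[left], anchors[right]))
    return start, end


if __name__ == "__main__":
    # Hypothetical gene order in the reference and hypothetical map
    # positions (arbitrary units) of the anchored genes in the target.
    reference_order = ["geneA", "geneB", "geneC", "geneD", "geneE"]
    anchors = {"geneA": 120.0, "geneB": 135.5, "geneE": 180.2}
    print(candidate_interval(reference_order, anchors, "geneC"))
```

An interval found this way only narrows the search. Any gene located within it remains a candidate until it has been characterized experimentally, for the reasons given above.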