"Frontmatter". In: Plant Genomics and Proteomics
Christopher A. Cullis - Plant Genomics and Proteomics-J. Wiley & Sons (2004)
SEQUENCING AND DATA PROCESSING
The sequencing activity is essentially similar for all of the approaches. With the MTP, the BACs are usually shotgun cloned into a new vector and the resulting fragments sequenced to a coverage sufficient to ensure that every possible fragment is included. The sequences of all the fragments are then assembled into the linear order in which they occurred in the original BAC clone. Obviously, if a BAC clone contains copies of a repetitive sequence, then the assembly of the complete BAC sequence will be more difficult.
Most of the high-throughput sequencing centers are based on the use of ABI DNA sequencers and fluorescent DNA sequencing chemistry, with software for base calling, trace trimming, and quality assessment to ensure a uniform data standard for genome assembly (Ewing and Green, 1998; Ewing et al., 1998). The first step in generating an assembly of shotgun sequence data is to group the shotgun sequences together into clusters of overlapping sequences. The second step is usually to check the quality of the sequence reads and then to identify possible contaminating vector or other sequences missed when the initial trimming of the sequences was done. Cloned sequences are then usually compared with sequences in public and other accessible databases, such as the GenBank Nr (nonredundant) and EST databases, and classified according to the nature and significance of their BLAST hits. Any repetitive sequences that are already known for the species under consideration can be masked to eliminate them from the analysis.
These analyses depend on robust and reliable clustering protocols that are stringent enough to avoid errors in the clustering of gene families but relaxed enough for appropriate groupings to be found. All the sequences, together with their assembly and analysis, can then be stored in an appropriate database.
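The trimming and clustering steps described above can be illustrated with a toy sketch. This is not the method used by any particular sequencing center or assembler; it is a minimal illustration under simplifying assumptions: reads are plain strings, quality values are integer scores per base, overlaps are exact suffix-to-prefix matches on one strand only, and the function names (`trim_low_quality`, `overlap_length`, `cluster_reads`) and thresholds (`min_q`, `min_len`) are hypothetical. Production pipelines tolerate mismatches, consider both strands, and weight overlaps by base quality.

```python
from itertools import combinations


def trim_low_quality(seq, quals, min_q=20):
    """Trim bases from both ends until a base with quality >= min_q is reached."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end]


def overlap_length(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def cluster_reads(reads, min_len=4):
    """Group read indices into clusters linked by exact overlaps (union-find)."""
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(reads)), 2):
        # Toy assumption: exact matches, same strand; test both read orders.
        if (overlap_length(reads[i], reads[j], min_len)
                or overlap_length(reads[j], reads[i], min_len)):
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Invented example data, for illustration only.
    print(trim_low_quality("NNACGTNN", [5, 8, 30, 35, 40, 32, 10, 3]))  # ACGT
    reads = ["ACGTACGGA", "CGGATTTC", "TTTCGGGA", "GGGGCCCCAA", "CCAATGTG"]
    print(cluster_reads(reads))  # [[0, 1, 2], [3, 4]]
```

The `min_len` parameter mirrors the stringency trade-off noted in the text: a larger value avoids spurious joins between similar sequences, such as members of a gene family or copies of a repeat, while a smaller value finds more groupings at the risk of merging unrelated reads.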
Because whole genome sequencing is usually a cooperative effort distributed among many laboratories worldwide, a common and integrated information environment is essential so that detailed tracking and control of the information processes can be achieved.