"Frontmatter". In: Plant Genomics and Proteomics


Christopher A. Cullis - Plant Genomics and Proteomics-J. Wiley & Sons (2004)

SEQUENCING AND DATA PROCESSING
The sequencing activity is essentially the same for all of the approaches.
With the MTP, each BAC is usually shotgun cloned into a new vector and the
resulting fragments are sequenced to a depth of coverage high enough that
every part of the BAC is represented. The fragment sequences are then
assembled into the linear order they had in the original BAC clone. If a
BAC clone contains multiple copies of a repetitive sequence, assembling the
complete BAC sequence is more difficult.
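
The link between sequencing depth and completeness can be sketched with the idealized Poisson (Lander-Waterman) model, which is not taken from the text above and assumes reads land uniformly at random. The 150 kb BAC size and 500-base read length used in the usage note are illustrative assumptions, as are all function names.

```python
import math


def coverage_depth(n_reads: int, read_len: int, target_len: int) -> float:
    """Average depth c = N * L / G for N reads of length L over a target of length G."""
    return n_reads * read_len / target_len


def expected_fraction_covered(depth: float) -> float:
    """Under the idealized Poisson model a given base is missed with probability e^-c."""
    return 1.0 - math.exp(-depth)


def reads_needed(depth: float, read_len: int, target_len: int) -> int:
    """Number of reads required to reach the requested average depth."""
    return math.ceil(depth * target_len / read_len)
```

For a hypothetical 150 kb BAC and 500-base reads, `reads_needed(8, 500, 150_000)` gives 2400 reads, and `expected_fraction_covered(8)` is about 0.9997. Real clones fall short of this figure because cloning bias and repeats violate the uniformity assumption.
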
Most high-throughput sequencing centers use ABI DNA sequencers and
fluorescent DNA sequencing chemistry, together with software for base
calling, trace trimming, and quality assessment, to ensure a uniform data
standard for genome assembly (Ewing and Green, 1998; Ewing et al., 1998).
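
Base-calling quality is commonly expressed as a Phred-style score, Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The sketch below converts between the two and applies a deliberately simple end-trimming rule. The threshold of 20 and the trimming rule itself are illustrative assumptions, not the procedure of any particular sequencing center.

```python
import math
from typing import Sequence


def phred_to_error_prob(q: float) -> float:
    """Error probability implied by a Phred-style quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred-style quality score for a given error probability."""
    return -10 * math.log10(p)


def trim_low_quality(seq: str, quals: Sequence[int], min_q: int = 20) -> str:
    """Keep the span from the first to the last base whose quality is >= min_q.

    Hypothetical rule for illustration; production trimmers usually work
    on sliding windows rather than single bases.
    """
    good = [i for i, q in enumerate(quals) if q >= min_q]
    if not good:
        return ""
    return seq[good[0]:good[-1] + 1]
```

A score of 20 corresponds to a 1 in 100 chance of error, and 30 to 1 in 1000. Calling `trim_low_quality("ACGTACGT", [5, 10, 25, 30, 30, 22, 8, 4])` drops the two low-quality bases at each end and returns `"GTAC"`.
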
The first step in generating an assembly of shotgun sequence data is to
group the shotgun sequences into clusters of overlapping sequences. The
second step is usually to check the quality of the sequence reads and then
to identify any contaminating vector or other sequences that were missed
during the initial trimming. Cloned sequences are then usually compared
with sequences in public and other accessible databases, such as the
GenBank nr (nonredundant) and EST databases, and classified according to
the nature and significance of their BLAST hits. Any repetitive sequences
already known for the species under consideration can be masked to exclude
them from the analysis.
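
Vector screening and repeat masking can be sketched with exact string matching. This is a minimal illustration under strong assumptions: real pipelines use alignment-based tools that tolerate mismatches, and the function names, sequences, and `min_match` threshold here are all hypothetical.

```python
from typing import Iterable


def strip_leading_vector(read: str, vector: str, min_match: int = 8) -> str:
    """Remove a read prefix that exactly matches the end of the vector sequence.

    The longest candidate match is tried first; matches shorter than
    min_match are ignored to avoid trimming on chance similarity.
    """
    for k in range(min(len(read), len(vector)), min_match - 1, -1):
        if read.startswith(vector[-k:]):
            return read[k:]
    return read


def mask_repeats(seq: str, repeats: Iterable[str], mask_char: str = "N") -> str:
    """Replace every exact occurrence of each known repeat with mask_char."""
    masked = list(seq)
    for rep in repeats:
        if not rep:
            continue  # an empty pattern would match everywhere
        start = seq.find(rep)
        while start != -1:
            masked[start:start + len(rep)] = mask_char * len(rep)
            start = seq.find(rep, start + 1)  # step by 1 to catch overlapping copies
    return "".join(masked)
```

Masking keeps the sequence length and coordinates unchanged, so later steps can still map features back to the original read while ignoring the masked positions.
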
These analyses depend on robust and reliable clustering protocols that are
stringent enough to avoid errors in the clustering of gene families but
relaxed enough for appropriate groupings to be found. All the sequences,
together with their assembly and analysis, can then be stored in an
appropriate database. Because whole genome sequencing is usually a
cooperative effort distributed among many laboratories worldwide, a common
and integrated information environment is essential so that the
information processes can be tracked and controlled in detail.
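
The stringency trade-off in clustering can be illustrated with a toy single-linkage scheme in which two reads join the same cluster when the end of one exactly matches the start of the other over at least `min_overlap` bases. This is a sketch under simplifying assumptions (exact matches only, no reverse complements, made-up reads), not the protocol of any specific assembler.

```python
from itertools import permutations
from typing import List, Sequence


def overlap_len(a: str, b: str, min_overlap: int) -> int:
    """Longest suffix of a that equals a prefix of b, or 0 if below min_overlap."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def cluster_reads(reads: Sequence[str], min_overlap: int) -> List[List[str]]:
    """Group reads into clusters connected by sufficiently long overlaps."""
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in permutations(range(len(reads)), 2):
        if overlap_len(reads[i], reads[j], min_overlap) > 0:
            parent[find(i)] = find(j)

    clusters: dict = {}
    for i, read in enumerate(reads):
        clusters.setdefault(find(i), []).append(read)
    return sorted(clusters.values())


# Hypothetical reads: three overlap by 5 bases, the last joins only via 3 bases.
reads = ["ACGTACGTGG", "CGTGGTTAAC", "TTAACCCGGA", "GGATTTTTTT"]
relaxed = cluster_reads(reads, min_overlap=3)
strict = cluster_reads(reads, min_overlap=4)
```

With `min_overlap=3` all four reads fall into one cluster; raising it to 4 splits off the last read, whose only link was a 3-base overlap. The same knob that prevents chance joins between members of a gene family can also break up reads that genuinely belong together.
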
