"Frontmatter". In: Plant Genomics and Proteomics
Christopher A. Cullis - Plant Genomics and Proteomics-J. Wiley & Sons (2004)
GENERATING A PHYSICAL MAP
The current technology for generating a physical map usually involves ordering a series of BAC clones (Marra et al., 1997). The BAC libraries can be generated by using a number of different restriction enzymes (see Chapter 2) or by random shearing. The clones from these libraries are then fingerprinted. The fingerprinting involves isolating the BAC DNAs, digesting them with a restriction enzyme, and running the fragments generated on a gel. The overlap between different BACs is calculated from the number of identically sized fragments that they have in common (Soderlund et al., 2000). When a sufficient overlap between two BACs is found (the cutoff is user defined), those two BACs are assumed to have the region containing the identically sized bands in common and are therefore placed in a contig (contiguous sequence) (Figure 3.2). The comparisons and assembly can be automated with the use of image software to analyze the gels, and the fingerprint contigs (FPC) software is used to assemble the contigs (Soderlund et al., 2000).
All the data generated from thousands of BACs are assembled, and the assembly constitutes the physical map. This assembly must be checked by confirming that the genes, markers, or sequences that are known to be on the BACs assembled into specific contigs are actually present in close proximity on the genetic map. Ultimately, all of the data that have been accumulated from the molecular mapping exercises can also be placed on the physical map. Thus where a unique molecular marker is hybridized to the BAC library, the selected fingerprinted BACs to which it hybridizes are immediately anchored to the position of the chromosome assigned to the marker. If the genetic and physical maps disagree, then the conflict must be resolved to determine which assembly, the genetic map or the BAC contig, is correct.
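The band-sharing comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring used by the FPC software: the function names, the size tolerance, the cutoff, the clone names, and the fragment sizes are all hypothetical, and a plain count of shared bands stands in for whatever statistic a real assembler applies.

```python
def shared_bands(bands_a, bands_b, tolerance=2):
    """Count bands whose sizes agree within `tolerance` (hypothetical units).

    Bands are paired greedily in size order, and each band is used at most once.
    """
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


def build_contigs(fingerprints, cutoff, tolerance=2):
    """Group clones into contigs: two clones are linked when they share
    at least `cutoff` bands, and linked clones end up in the same contig."""
    parent = {name: name for name in fingerprints}

    def find(name):
        # Follow parent links to the representative clone of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    names = sorted(fingerprints)
    for idx, first in enumerate(names):
        for second in names[idx + 1:]:
            score = shared_bands(fingerprints[first], fingerprints[second], tolerance)
            if score >= cutoff:
                parent[find(first)] = find(second)

    contigs = {}
    for name in names:
        contigs.setdefault(find(name), []).append(name)
    return sorted(contigs.values())


# Invented fragment sizes for four hypothetical clones.
fingerprints = {
    "bacA": [100, 250, 400, 610, 900],
    "bacB": [251, 400, 611, 900, 1200],
    "bacC": [610, 900, 1200, 1500, 1800],
    "bacD": [130, 330, 720, 1010, 1400],
}

print(build_contigs(fingerprints, cutoff=3))
# prints [['bacA', 'bacB', 'bacC'], ['bacD']]
```

In this toy data bacA and bacC share only two bands, which is below the cutoff, yet both overlap bacB and so fall into one contig. That is the role the intermediate clones play in the fingerprinting method.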
FIGURE 3.2. Illustration of the DNA fingerprinting method. Horizontal lines (a–r) represent individual BAC clones that have been aligned based on shared restriction fragment patterns. Vertical lines (1–17) represent the positions of restriction sites that were used. For actual data, the comparison usually includes 20–40 bands. In the figure the overlap between clones a and b is very likely to be correct. However, without the intermediate clones b through h, the overlap between a and i would be tenuous at best. As can be seen from the diagram, to be certain about the relative overlaps the same region must be sampled multiple times, with 13 different BACs contained within the length of a.
Such physical maps based on fingerprinted contigs have been constructed for a number of species including humans, Arabidopsis, and rice. EST markers that have been mapped to BAC clones can be entered into the FPC database. These data will help the assembly process, as well as placing the ESTs on both anchored and unanchored contigs (Soderlund et al., 2000). The FPC software can use both the fingerprints and the markers in generating the assembly, so if two clones share a marker, then a less stringent overlap based on fingerprints will still be recorded as an overlap. Because FPC can generate incremental updates, the contigs can be built as the data are generated, rather than having to await all the data and having a massive one-time final build of the physical map. How many BACs are needed to achieve such a physical map (assuming that there are no structural impediments to achieving the overlapping sets such as identical duplicated regions)?
If we assume that the BAC library was generated with the average insert size of 125 kb, then a complete genome would be contained in about 1000 BACs for Arabidopsis (a genome size of 125 Mb), 3512 BACs for rice (genome size of 439 Mb), 21,728 BACs for maize (genome size of 2,716 Mb), and 128,000 BACs for wheat (genome size of 16,000 Mb). As can be seen from Figure 3.2, multiple sets of the genome must be fingerprinted, perhaps up to 20 times the number required for a complete genome, so that enough representatives from each region can be sampled. Thus a twentyfold (20×) oversampling would mean that the numbers for Arabidopsis rise to 20,000, for rice to 70,000, for maize to 435,000, and for wheat to 2,560,000 BACs. Therefore, as the genome size increases the number of BACs that must be fingerprinted to get some meaningful assembly of the genome also rises. Even with this level of oversampling, most of the plant genomes would not be assembled into the number of contigs that is the same as the number of chromosomes (the ideal result). In general, the number of contigs will be much larger than the chromosome number and the average size of the contigs much smaller than the length of the chromosome.
These contigs can then be placed on the genetic map by using molecular markers that hybridize to the BACs within a contig to determine the order of the contigs along the chromosome. Obviously, for this ordering along the genetic map to be successful, the spacing of the molecular markers has to be less than the size of the contigs so that at least one marker is present on each of the contigs. Alternatively, the BACs can be directly mapped onto the chromosomes with fluorescent in situ hybridization (FISH).
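The BAC counts quoted above follow from dividing genome size by insert size and multiplying by the desired coverage. The short Python sketch below reproduces that arithmetic. The function and constant names are illustrative only, and 1 Mb is taken as 1000 kb.

```python
import math

INSERT_SIZE_KB = 125  # average BAC insert size assumed in the text

# Genome sizes in Mb, as quoted in the text.
GENOME_SIZES_MB = {
    "Arabidopsis": 125,
    "rice": 439,
    "maize": 2716,
    "wheat": 16000,
}


def bacs_needed(genome_mb, coverage=1, insert_kb=INSERT_SIZE_KB):
    """BACs whose combined inserts equal `coverage` genome equivalents."""
    genome_kb = genome_mb * 1000  # 1 Mb = 1000 kb
    return math.ceil(genome_kb * coverage / insert_kb)


for species, size_mb in GENOME_SIZES_MB.items():
    one_fold = bacs_needed(size_mb)
    twenty_fold = bacs_needed(size_mb, coverage=20)
    print(f"{species}: {one_fold:,} BACs at 1x, {twenty_fold:,} at 20x")
```

The exact 20× values for rice and maize come out as 70,240 and 434,560, which the text rounds to 70,000 and 435,000.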