Practical exercise, unit 7. Topic: a MapReduce program in Java. Purpose: study the map function, the reduce function, and the code for the assignment.



WHY ARE HDFS BLOCKS SO BIG?
HDFS blocks are significantly larger than ordinary disk blocks. This is done to reduce the number of seek operations. If the block is large enough, the time spent transferring data from disk can be much longer than the time spent seeking to the start of the block. Thus, the time to transfer a large file made up of many blocks is determined by the disk transfer rate rather than by seek time.
A simple calculation shows that if the seek time is about 10 milliseconds and the transfer rate is 100 MB/s, then for the seek time to be 1% of the transfer time, the block size needs to be about 100 MB. Many HDFS installations have raised the block size from the older default of 64 MB to 128 MB. As transfer rates grow with new generations of disks, block sizes can be expected to grow as well.
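The calculation above can be sketched in Java. The figures (10 ms seek time, 100 MB/s transfer rate, 1% target overhead) come from the text; the class and variable names are illustrative.

```java
// Sketch of the block-size rule of thumb: pick a block size large enough
// that seek time is only ~1% of the time spent transferring the block.
public class BlockSizeEstimate {
    public static void main(String[] args) {
        double seekTimeMs = 10.0;          // time to locate the start of a block
        double transferRateMBperS = 100.0; // sustained disk transfer rate
        double targetSeekFraction = 0.01;  // seek time should be 1% of transfer time

        // Transfer time must be (1 / 0.01) = 100x the seek time.
        double transferTimeS = (seekTimeMs / 1000.0) / targetSeekFraction;
        double blockSizeMB = transferRateMBperS * transferTimeS;

        System.out.println("Suggested block size: " + blockSizeMB + " MB");
        // Prints: Suggested block size: 100.0 MB
    }
}
```

With faster disks (a higher `transferRateMBperS`), the same 1% target yields a larger suggested block size, which matches the text's expectation that block sizes will grow.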
However, the block size should not be increased too far. A map task in MapReduce usually works on one block at a time, so if there are too few tasks (fewer than the nodes in the cluster), the job will run more slowly than it otherwise could.
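The trade-off can be illustrated with a small sketch: one map task per block, so an oversized block leaves nodes idle. The file size (1 GB) and cluster size (10 nodes) are assumed figures, not from the text.

```java
// Sketch: why too large a block size can under-utilize a cluster.
// Assumption: one map task is scheduled per HDFS block.
public class MapTaskCount {
    public static void main(String[] args) {
        long fileSizeMB = 1024; // a hypothetical 1 GB input file
        int nodes = 10;         // a hypothetical 10-node cluster

        for (long blockSizeMB : new long[] {64, 128, 512}) {
            // Number of blocks, rounding up for a partial final block.
            long mapTasks = (fileSizeMB + blockSizeMB - 1) / blockSizeMB;
            System.out.printf("block %d MB -> %d map tasks (%d nodes available)%n",
                              blockSizeMB, mapTasks, nodes);
        }
        // With 512 MB blocks there are only 2 map tasks,
        // so 8 of the 10 nodes would sit idle during the map phase.
    }
}
```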
Having a block abstraction in a distributed file system brings several advantages. The first is the most obvious: a file can be larger than any single disk in the network. The blocks of a file do not have to be stored on one disk; they can use any of the disks in the cluster. In fact, it is possible (though unusual) for a single file stored in an HDFS cluster to have its blocks spread across all the disks in the cluster.
Second, making the unit of storage the block rather than the whole file simplifies the storage subsystem. Simplicity is desirable in any system, but it is especially important in distributed systems, where failure modes are so varied. Working with blocks simplifies storage management (since blocks have a fixed size, the system can easily calculate how many blocks fit on a given disk) and removes metadata concerns (a block is just a chunk of data to be stored; file metadata, such as access permissions, does not need to be stored with the blocks and can be handled separately by another system).
In addition, blocks fit well with the replication mechanism, which improves the system's fault tolerance and availability. To guard against corrupt blocks and disk or machine failures, each block is physically replicated on a small number of separate machines (usually three). If a block is unavailable, a copy is read from another location in a way that is transparent to the client. A block lost to data corruption or hardware failure is re-replicated from its surviving copies to other healthy machines to bring the replication factor back to its normal level. (For more on guarding against data corruption, see the section on data integrity on page 125.) Furthermore, some applications may set a high replication factor for the blocks of frequently read files, to spread the read load across the cluster.
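The block size and replication factor described above are configurable in HDFS. A minimal sketch of an `hdfs-site.xml` fragment, using the standard `dfs.blocksize` and `dfs.replication` properties (the values shown are the ones discussed in the text):

```xml
<!-- Illustrative hdfs-site.xml fragment; values match those in the text. -->
<configuration>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value> <!-- 128 MB, expressed in bytes -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- each block is kept on three machines -->
  </property>
</configuration>
```

For an individual file that is read heavily, the replication factor can also be raised after the fact with the shell command `hadoop fs -setrep`, without changing the cluster-wide default.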
