An independent work prepared for the course "Ma'lumotlar kommunikatsiyasi" (Data Communication).
Submitted by:
Accepted by: O. I. Ergashev
Farg'ona, 2023
4 CHOOSING WHERE TO APPLY COMPRESSION
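Compression can be applied at three points in a MapReduce job: to the job input (Hadoop infers the codec from the file extension), to the intermediate map output, and to the final reduce output. As a minimal sketch, using the standard Hadoop 2.x property names that also appear in the driver code later in this work (the helper class name CompressionSettings is hypothetical, and BZip2Codec is just one of the codecs discussed here), the map-side and job-output switches can be set on a Configuration like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.BZip2Codec;
import org.apache.hadoop.io.compress.CompressionCodec;

public class CompressionSettings {
    // Returns a Configuration with compression enabled for both the
    // intermediate map output and the final job output.
    public static Configuration withCompression() {
        Configuration conf = new Configuration();
        // Compress the map output: saves disk and network I/O during the shuffle
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                BZip2Codec.class, CompressionCodec.class);
        // Compress the final job output: saves space in HDFS
        conf.setBoolean("mapreduce.output.fileoutputformat.compress", true);
        conf.setClass("mapreduce.output.fileoutputformat.compress.codec",
                BZip2Codec.class, CompressionCodec.class);
        return conf;
    }
}

Choosing where to compress is a trade-off: map-output compression mainly reduces shuffle traffic, while output compression reduces storage at the cost of extra CPU time when the data is read back.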
Figure 1.3. Compression can be enabled at any stage of a MapReduce job.

5 CONFIGURING COMPRESSION PARAMETERS

Figure 5. The settings that can be configured to enable compression in Hadoop.

6 COMPRESSION USE CASES

1 Compressing and decompressing data streams

CompressionCodec provides two methods for conveniently compressing and decompressing data. To compress data being written to an output stream, call createOutputStream(OutputStream out) to obtain a CompressionOutputStream, which writes to the underlying stream in compressed form. Conversely, to decompress data coming from an input stream, call createInputStream(InputStream in) to obtain a CompressionInputStream and read uncompressed data from the underlying stream. The following test class exercises both methods:

package com.until.mapreduce.compress;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.util.ReflectionUtils;

public class TestCompress {

    public static void main(String[] args) throws Exception {
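        // Hypothetical driver body (the original was not preserved):
        // compress a file with BZip2, then decompress the result.
        // The file name "web.log" is illustrative, not from the source.
        compress("web.log", "org.apache.hadoop.io.compress.BZip2Codec");
        decompress("web.log.bz2");
    }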
    // 1. Compression
    private static void compress(String filename, String method) throws Exception {
        // (1) Open the input stream for the raw file
        FileInputStream fis = new FileInputStream(new File(filename));
        // Instantiate the requested codec reflectively from its class name
        Class<?> codecClass = Class.forName(method);
        CompressionCodec codec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, new Configuration());
        // (2) Open the output stream, appending the codec's default extension
        FileOutputStream fos = new FileOutputStream(new File(filename + codec.getDefaultExtension()));
        CompressionOutputStream cos = codec.createOutputStream(fos);
        // (3) Copy the stream
        IOUtils.copyBytes(fis, cos, 1024 * 1024 * 5, false);
        // (4) Close resources
        cos.close();
        fos.close();
        fis.close();
    }

    // 2. Decompression
    private static void decompress(String filename) throws FileNotFoundException, IOException {
        // (0) Check whether a codec is available for this file extension
        CompressionCodecFactory factory = new CompressionCodecFactory(new Configuration());
        CompressionCodec codec = factory.getCodec(new Path(filename));
        if (codec == null) {
            System.out.println("cannot find codec for file " + filename);
            return;
        }
        // (1) Open a decompressing input stream
        CompressionInputStream cis = codec.createInputStream(new FileInputStream(new File(filename)));
        // (2) Open the output stream
        FileOutputStream fos = new FileOutputStream(new File(filename + ".decoded"));
        // (3) Copy the stream
        IOUtils.copyBytes(cis, fos, 1024 * 1024 * 5, false);
        // (4) Close resources
        cis.close();
        fos.close();
    }
}

2 Using compression for the map output

Even when a MapReduce job's input and output files are uncompressed, the intermediate output of the map task can still be compressed, because it must be written to local disk and transferred over the network to the reduce node. Compressing it can significantly improve performance. Although this only requires setting two properties, let's look at how the code sets them.

2.1 Compression formats supported by the stock Hadoop source code: BZip2Codec, DefaultCodec

1) WordcountDriver

package com.until.mapreduce;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.BZip2Codec;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import java.io.IOException;

public class WordcountDriver {
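    // Standard MapReduce driver entry point (assumed; the original
    // method signature was not preserved).
    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {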
        Configuration configuration = new Configuration();
        // 1. Enable compression of the map-side output
        configuration.setBoolean("mapreduce.map.output.compress", true);
        // Set the codec for the map-side output
        configuration.setClass("mapreduce.map.output.compress.codec", BZip2Codec.class, CompressionCodec.class);

        Job job = Job.getInstance(configuration);
        // 2. Set the jar load path
        job.setJarByClass(WordcountDriver.class);
        // 3. Set the map and reduce classes
        job.setMapperClass(WordcountMapper.class);
        job.setReducerClass(WordcountReducer.class);
        // 4. Set the map output types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // 5. Set the final KV output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // 6. Set the input and output paths
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // 7. Submit the job
        boolean result = job.waitForCompletion(true);
        System.exit(result ? 0 : 1);
    }
}

2) WordcountMapper

package com.until.mapreduce;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import java.io.IOException;

public class WordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    Text k = new Text();
    IntWritable v = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // 1. Read one line
        String line = value.toString();
        // 2. Split it into words
        String[] words = line.split(" ");
        // 3. Emit each word with a count of 1
        for (String word : words) {
            k.set(word);
            context.write(k, v);
        }
    }
}

3) WordcountReducer

package com.until.mapreduce;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import java.io.IOException;

public class WordcountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    int sum;
    IntWritable v = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // 1. Sum the counts for this key
        sum = 0;
        for (IntWritable count : values) {
            sum += count.get();
        }
        // 2. Emit the total
        v.set(sum);
        context.write(key, v);
    }
}

3 Using compression for the reduce output

Building on the WordCount job above (modifying that example), only the WordcountDriver class changes; the Mapper and Reducer remain the same:

        // 4. Enable compression of the final output
        FileOutputFormat.setCompressOutput(job, true);
        // Set the compression codec
        FileOutputFormat.setOutputCompressorClass(job, BZip2Codec.class);

CONCLUSION

In the course of completing this independent work on using compression in MapReduce for the Data Communication ("Ma'lumotlar kommunikatsiyasi") course, I gained new knowledge and skills. I especially liked the variety, convenience, and speed of the compression formats available in MapReduce, and I will try to apply this knowledge and these skills in the future.

REFERENCES

1. https://russianblogs.com/article/90892883010/
2. https://en.wikipedia.org/wiki/MapReduce
3. https://ru.zahn-info-portal.de/wiki/MapReduce