Problems to store, transfer and process the Big Data


Course: Modern Data Storage Technology

Course moderator: Dr. Alexander Shkrebets

Date: 14.12.2022


Presented by: Karam Mahfod
Group: K42105c
ID: 336111
Mail: Karammah1235@gmail.com

Contents


Introduction
Big Data Characteristics
Big Data Problems:
    • Storage
    • Transfer
    • Processing

Big Data in the Cloud
    • Advantages
    • Disadvantages

Conclusion

Introduction

  • Big Data is a term used to describe the large amount of data in the networked, digitized, sensor-laden, information-driven world. The growth of data is outpacing scientific and technological advances in data analytics. Opportunities exist with Big Data to address the volume, velocity and variety of data through new scalable architectures.

Big Data Characteristics

Big Data Problems

  • Big Data challenges center on how best to handle very large amounts of data: storing it and analyzing huge sets of information spread across various data stores. Several major challenges arise when dealing with Big Data, and each needs to be addressed.

Big Data Problems: Storage

  • Solution:
  • The most suitable approach to Big Data storage is to use hyperscale computing environments, which can be extended flexibly as needs change.

Big Data Problems: Transfer

  • Solution:
  • First, process the data "in place" and transmit only the resulting information. In other words, "bring the code to the data" rather than the traditional "bring the data to the code". Second, perform triage on the data and transmit only the data that is critical to downstream analysis. In either case, integrity and provenance metadata should be transmitted along with the actual data.
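The two transfer strategies can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`triage`, `summarize_in_place`, `verify`), the threshold rule for what counts as "critical", the source label, and the choice of SHA-256 over canonical JSON as the integrity check. It is not a prescribed protocol, only one way the idea could look.

```python
import hashlib
import json
from datetime import datetime, timezone


def triage(readings, threshold):
    """Keep only readings treated as critical (assumed rule: above a threshold)."""
    return [r for r in readings if r > threshold]


def summarize_in_place(readings, source_id):
    """Reduce raw readings where they are stored and wrap the result for transfer."""
    summary = {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }
    # Canonical JSON (sorted keys) so sender and receiver hash identical bytes.
    payload_bytes = json.dumps(summary, sort_keys=True).encode("utf-8")
    return {
        "payload": summary,
        # Integrity metadata: lets the receiver detect corruption in transit.
        "integrity": {
            "algorithm": "sha256",
            "digest": hashlib.sha256(payload_bytes).hexdigest(),
        },
        # Provenance metadata: where the result came from and when it was made.
        "provenance": {
            "source": source_id,
            "created_utc": datetime.now(timezone.utc).isoformat(),
        },
    }


def verify(message):
    """Recompute the digest on the receiving side and compare it."""
    payload_bytes = json.dumps(message["payload"], sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload_bytes).hexdigest() == message["integrity"]["digest"]


raw = [12.0, 15.5, 99.0, 14.0, 101.5]       # stands in for a large local data set
critical = triage(raw, threshold=90.0)       # strategy 2: send only critical data
message = summarize_in_place(raw, "sensor-site-A")  # strategy 1: send only the result
print(critical)
print(message["payload"], verify(message))
```

In both cases only a small message crosses the network, while the raw readings stay where they were produced.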

Big Data Problems: Processing

Processing big data is a major challenge, perhaps more so than the storage or management problem.

Assume that an exabyte of data (1 exabyte = 1,000 petabytes) needs to be processed in its entirety. For simplicity, assume the data is chunked into blocks of 8 words. Assuming a processor expends 100 instructions on one block at 5 gigahertz, with one instruction per clock cycle, the time required to process one block would be 20 nanoseconds.

Processing the full 1,000 petabytes sequentially on a single processor would require a total end-to-end processing time of roughly 635 years, a figure that corresponds to about 10^18 sequential steps of 20 nanoseconds each. Thus, effective processing of exabytes of data will require extensive parallel processing and new analytics algorithms in order to provide timely and actionable information.
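The arithmetic behind this estimate can be checked with a short Python sketch. The inputs taken from the text are 100 instructions per block, a 5 GHz clock, and 1 exabyte of data; everything else is an assumption for illustration: one instruction per cycle, a decimal exabyte (10^18 bytes), a 365.25-day year, the two block sizes tried, and perfectly linear scaling across processors. The function names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600
EXABYTE = 10**18  # bytes, decimal definition (assumption)


def seconds_per_block(instructions=100, clock_hz=5e9):
    """Time for one block, assuming one instruction per clock cycle."""
    return instructions / clock_hz


def total_seconds(total_bytes, block_bytes):
    """Sequential processing time on a single processor."""
    blocks = total_bytes / block_bytes
    return blocks * seconds_per_block()


per_block_s = seconds_per_block()  # 100 / 5e9 = 20 nanoseconds

# One 20 ns step per byte: about 10^18 steps, which gives several hundred years.
years_per_byte_step = total_seconds(EXABYTE, 1) / SECONDS_PER_YEAR

# If a block is 8 words of 8 bytes each (64 bytes, an assumption), the
# sequential time is 64 times shorter.
years_64_byte_blocks = total_seconds(EXABYTE, 64) / SECONDS_PER_YEAR

# Processors needed to finish the per-byte case within one day,
# assuming ideal linear speed-up (real systems scale worse than this).
processors_for_one_day = math.ceil(total_seconds(EXABYTE, 1) / SECONDS_PER_DAY)

print(f"per block: {per_block_s * 1e9:.0f} ns")
print(f"per-byte steps: {years_per_byte_step:.0f} years")
print(f"64-byte blocks: {years_64_byte_blocks:.1f} years")
print(f"processors for a one-day run: {processors_for_one_day}")
```

Under these assumptions the per-byte case lands close to the 635-year figure, and even the more favorable block size leaves a single processor busy for years, which is why parallel processing is needed.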

Processing solution

Big Data does not appear out of nowhere; there must be a source producing the data.

Big Data in the Cloud: Advantages

  • Big data involves manipulating petabytes (and perhaps soon, exabytes and zettabytes) of data, and the cloud's scalable environment makes it possible to deploy data-intensive applications that power business analytics. The cloud also simplifies connectivity and collaboration within an organization, which gives more employees access to relevant analytics and streamlines data sharing.
  • Large volumes of both structured and unstructured data require more processing power, storage, and other resources. The cloud provides not only readily available infrastructure, but also the ability to scale this infrastructure quickly to manage large spikes in traffic or usage.

Big Data in the Cloud: Disadvantages

Migrating big data to the cloud presents various hurdles.

  • Less control over security
  • These large datasets often contain sensitive information such as individuals' addresses, credit card details, social security numbers, and other personal information.

  • Network dependency and latency issues
  • The flip side of easy connectivity to data in the cloud is that the availability of the data depends heavily on the network connection. This dependence on the internet means that the system can be prone to service interruptions. In addition, latency in the cloud environment can become an issue given the volume of data being transferred, analyzed, and processed at any given time.
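To give a feel for how strongly the network limits large transfers, the following Python sketch estimates the time needed to move a data set over a link. The numbers are assumptions chosen only for illustration (1 petabyte, a 10 Gbit/s link, and an 80% effective throughput), and the function name is hypothetical; real transfers also depend on protocol overhead, contention, and round-trip latency.

```python
SECONDS_PER_DAY = 24 * 3600
PETABYTE = 10**15  # bytes, decimal definition (assumption)


def transfer_time_seconds(size_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate transfer time; efficiency is the usable fraction of the link."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


# Ideal case: the full 10 Gbit/s is available for the whole transfer.
ideal_days = transfer_time_seconds(PETABYTE, 10e9) / SECONDS_PER_DAY

# More cautious case: only 80% of the link is usable (assumed figure).
cautious_days = transfer_time_seconds(PETABYTE, 10e9, efficiency=0.8) / SECONDS_PER_DAY

print(f"ideal: {ideal_days:.2f} days, at 80% efficiency: {cautious_days:.2f} days")
```

Even under ideal conditions the transfer takes more than a week, so any interruption of the connection directly delays access to the data.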

Conclusion

For now, managing big data is somewhat feasible, but the outlook is that many new algorithms and technologies will have to be produced in order to deal with the massive amounts of data generated daily.

