What is Big Data?
Big Data describes volumes of structured and unstructured data so large that they are difficult to process using traditional techniques. It can be defined as a data set whose size or type exceeds the ability of conventional relational databases to capture, manage, and process it with low latency.
Put simply, Big Data is just what it sounds like: a whole lot of data.
Big Data means larger, more complex data sets, especially from new data sources. These data sets are so voluminous that traditional data processing software cannot manage them, yet they can be used to solve business problems you previously couldn't solve.
The concept of Big Data is relatively new and reflects the growth in both the amount and the variety of data now being collected. Big Data proponents often refer to this as the world's "datafication." As more of the world's information moves online and becomes digital, analysts can start using it as data. Social media, online books, music, videos, and an ever-growing number of sensors all add to the volume of data available for analysis. Everything done online is now stored and tracked as data. Reading a book on your Kindle yields data about what you read, when you read it, how fast you read it, and so on. Likewise, listening to music generates data about what you listen to, when, how often, and in what order. Your smartphone constantly uploads data about where you are, how fast you move, and which apps you use most.
Big Data also refers to the practice of analyzing and extracting information from massive data sets. The term likewise describes volumes of data that grow exponentially over time, so large and complex that no conventional method or traditional data management tool can store or process them effectively. Examples of Big Data are everywhere: organizations across industries, from social media platforms to e-commerce stores, generate and leverage such data to improve their processes.
According to Wikipedia:
“Big data is a term that applies to the growing availability of large datasets in information technology.”
Big Data typically includes data sets beyond the capabilities of commonly used software tools to capture, curate, manage, and process within a tolerable time. The Big Data philosophy covers unstructured, semi-structured, and structured data, though the main focus is on unstructured data.
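To make that distinction concrete, here is a minimal Python sketch of the three shapes of data (the records themselves are hypothetical):

```python
import json

# Structured: a fixed schema, ready for a relational table.
structured_row = ("user_42", "2021-07-01", 3)  # (user_id, date, items_purchased)

# Semi-structured: self-describing, flexible schema (JSON).
semi_structured = json.loads(
    '{"user": "user_42", "events": [{"type": "click", "ts": "2021-07-01T10:15:00Z"}]}'
)

# Unstructured: free text (or images, audio, video) with no schema at all.
unstructured_review = "Loved the product, but delivery took two weeks."

print(structured_row)
print(semi_structured["events"][0]["type"])  # -> click
print(unstructured_review)
```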
Big data "size" is a constantly moving target; as of 2012, it ranged from a few dozen terabytes to many petabytes in a single data set. In addition, Big Data requires a series of techniques and technologies, with new forms of integration, to reveal insights from diverse, complex, and large-scale data sets.
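For a sense of the units involved, a quick back-of-the-envelope calculation (decimal SI units):

```python
# Orders of magnitude for data sizes (decimal SI units).
TB = 10**12  # terabyte
PB = 10**15  # petabyte
ZB = 10**21  # zettabyte

print(PB // TB)  # 1000: a petabyte is a thousand terabytes
print(ZB // TB)  # 1000000000: a zettabyte is a billion terabytes
```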
In general, Big Data can be described as follows:
- Data: facts, descriptions of events in the real world
- Information: the result of processing data and placing it in context
- Knowledge: a model of the world built from information
- Decision: an action taken on the basis of knowledge
An example is the COVID-19 pandemic (see the sketch after this list):
- Data: counts of positive, negative, recovered, and deceased cases (in Indonesia), plus the status of PPKM (a movement-restriction policy)
- Information: charts and histograms showing the development of the COVID-19 pandemic in Indonesia
- Knowledge: the effect of PPKM on the rise and fall of COVID-19 cases
- Decisions: whether to extend PPKM, accelerate vaccination, hold school offline or keep it online, and so on
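Here is a toy Python sketch of that data-to-information-to-knowledge ladder. The case counts are invented for illustration, not real COVID-19 figures, and the "knowledge" step is deliberately simplistic:

```python
# Data: raw daily new-case counts (invented numbers, for illustration only).
daily_new_cases = {
    "2021-07-01": 10_000,
    "2021-07-08": 16_000,  # assume PPKM (movement restrictions) starts here
    "2021-07-15": 22_000,
    "2021-07-22": 18_000,
    "2021-07-29": 13_000,
}

# Information: the same data summarized so the trend becomes visible.
dates = sorted(daily_new_cases)
weekly_change = [
    daily_new_cases[later] - daily_new_cases[earlier]
    for earlier, later in zip(dates, dates[1:])
]
print("Week-over-week change:", weekly_change)  # [6000, 6000, -4000, -5000]

# Knowledge: an interpretation of the information, e.g. "cases began falling
# two weeks after restrictions started", which in turn supports a decision
# (extend PPKM or not, accelerate vaccination, reopen schools or stay online).
trend = "falling" if weekly_change[-1] < 0 else "rising"
print(f"Cases are {trend}.")
```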
Although the concept of big data itself is relatively new, the origins of large data sets go back to the 1960s and '70s, when the first data centers were built and relational databases were developed.
Around 2005, people realized how much data users generated through Facebook, YouTube, and other online services. Hadoop, an open-source framework created specifically for storing and analyzing large data sets, was developed in the same year, and NoSQL databases also began gaining popularity during this time. The development of open-source frameworks such as Hadoop (and, more recently, Spark) was critical to the growth of big data because they make big data easier to work with and cheaper to store. In the years since, the volume of big data has skyrocketed. Users still generate large amounts of data, but humans aren't the only ones doing it.
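As a rough illustration of what such frameworks do, here is a minimal PySpark sketch that aggregates a hypothetical event log; the file name and columns are assumptions, but the same few lines would run unchanged against terabytes of data on a cluster:

```python
# Minimal PySpark sketch: count events per user in a (hypothetical) log file
# with columns user_id,event_type. Requires a local Spark installation.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("big-data-sketch").getOrCreate()

# The same code works whether "events.csv" is a small local file or a huge
# dataset on HDFS or S3; Spark distributes the work across the cluster.
events = spark.read.csv("events.csv", header=True)
top_users = (
    events.groupBy("user_id")
          .agg(F.count("*").alias("num_events"))
          .orderBy(F.desc("num_events"))
)
top_users.show(10)  # the ten most active users

spark.stop()
```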
With the advent of the Internet of Things (IoT), more objects and devices are connected to the internet, gathering data about customer usage patterns and product performance, and the emergence of machine learning has generated even more data. While Big Data has come a long way, its uses are just beginning. Cloud computing has extended the possibilities even further: the cloud offers truly elastic scalability, where developers can quickly spin up ad hoc clusters to test subsets of data. Graph databases are becoming increasingly important, too, with their ability to represent large amounts of connected data in a way that makes analytics fast and comprehensive.
Big data on its own is just a collection of raw data that means nothing until it is put to use. Putting it to use is difficult, however, because it requires investments in infrastructure, human resources, and software. So far, internet-based service companies such as Google and Facebook have dominated large-scale data use. The data they exploit is not internal company data such as sales figures; instead, these companies use big data to discover per-consumer trends from the personal attributes of each consumer. For example, Amazon uses each user's attributes, purchase history, behavior, and other data as material for making recommendations tailored to that user.
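This is not Amazon's actual recommender, but a toy co-occurrence sketch in Python (with made-up purchase histories) shows the basic idea behind "customers who bought this also bought":

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories: user -> set of items bought.
histories = {
    "alice": {"camera", "tripod", "sd_card"},
    "bob":   {"camera", "sd_card", "lens"},
    "carol": {"tripod", "lens"},
}

# Count how often each pair of items is bought by the same user.
co_occurrence = Counter()
for items in histories.values():
    for a, b in combinations(sorted(items), 2):
        co_occurrence[(a, b)] += 1

def recommend(item, k=3):
    """Items most often bought together with `item`."""
    scores = Counter()
    for (a, b), n in co_occurrence.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(k)]

print(recommend("camera"))  # -> ['sd_card', 'tripod', 'lens']
```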
Google uses data accumulated on a vast scale to run its advertising business, while Facebook uses massive amounts of user data to increase profits from advertising, gaming, and software sales.
Hopefully this is helpful!