Big Data: revisiting specific concepts and technologies
Big Data is the term coined to describe an increasingly common phenomenon in organizations’ daily operations: the accelerated growth in the volume of data collected, usually automatically, in different formats and from a variety of sources.
At the beginning of the last decade, one definition of the aspects that characterize Big Data became widely accepted: the 3Vs concept of volume, velocity and variety. Each aspect relates to the data being generated and stored. Volume refers to the large quantity of data; velocity to the fast pace at which it is generated and stored; and variety to its different formats (structured, semi-structured and unstructured). This concept is still accepted today as a way to define and characterize Big Data.
In response to the challenges posed by the “3Vs”, new technologies had to be developed, because in many cases the tools used in previous decades for data processing, storage, retrieval and analysis did not meet the new demands satisfactorily. What emerged was, in fact, a set of technologies designed to address a new technological, sociocultural and market scenario.
The technologies that enable Big Data were initially developed by major Internet players, especially search engine companies such as Google and Yahoo. These companies faced the daily challenge of keeping up with the exponential growth in the number of web pages and their content worldwide. They needed a technological framework that could support this growth efficiently, reliably and economically.
New concepts in database management, file servers, and development and architecture models were researched, studied and tested. Distributed processing, horizontal scalability (adding more machines rather than upgrading a single one) and the flexibility to work with large and growing databases, among other aspects, were the needs that drove the development of the technologies that support the Big Data concept. This technology base was built predominantly on open source software, which brought great flexibility and speed to its advances and innovations. It consists of specific and widely tested technologies such as Hadoop, MapReduce and Apache HBase, among others. These technologies form the base layer of many successful Big Data projects today.
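To make the idea of distributed processing more concrete, here is a minimal sketch of the MapReduce programming model written in plain Python. It is not Hadoop code and runs on a single machine; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample pages are assumptions made only for illustration. The point is the shape of the computation: a map step that can run independently on each document, a shuffle step that groups results by key, and a reduce step that combines each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    # Each document is mapped independently; in a real cluster these
    # calls could run on different machines at the same time.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample data standing in for crawled web pages.
    pages = ["big data big volume", "data variety"]
    print(word_count(pages))
    # {'big': 2, 'data': 2, 'volume': 1, 'variety': 1}
```

Because the map step never looks at more than one document and the reduce step never looks at more than one key, the work can in principle be split across as many machines as needed, which is what makes this model a natural fit for horizontal scalability.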
Big Data is part of what the business world has called “digital transformation”, that is, the organizational changes resulting from the use of the Internet as a means of interaction among the agents of the business environment, whether customers, suppliers, public agencies, society, taxpayers or others. In this scenario, a growing number of communication channels are used, and every interaction with these different audiences generates data at a speed never seen before.
In the next post of this series, we will analyze the sociocultural and technological factors behind the recent changes in consumer behavior, which have led companies and organizations of various types to rethink their strategies in light of the digital transformation and, in this context, to adopt the concept and technologies of Big Data.