Do you think that Hadoop will be abandoned entirely in the future? Do you think that Spark will replace Hadoop in the long run? Will Hadoop one day be overwhelmed … These are questions about Big Data that are asked more and more often: by professionals, who fear that they have specialized in the wrong technology; by journalists, who fear that Hadoop will eventually go out of fashion; and by students, who wonder whether an academic curriculum in Big Data is really worth following.
In reality, these questions hide a deeper one: whether, after more than five years, the Big Data phenomenon was ultimately just a fad. We decided to answer this question once and for all in this article. To state our position up front: Big Data opportunities are real, and Hadoop will become the standard platform for data processing in Big Data. You will understand why by reading this article.
First, you need to understand what Big Data really is. Today, almost everyone perceives it as an explosion of data: the phenomenal volume of data produced by digital activities.
The well-known definition is that of the 3Vs: Volume, Variety and Velocity of data. This volumetric perception of Big Data is also associated with the following definition: “Structured or unstructured data whose very large volume requires adapted analysis tools”.
Unfortunately, thinking of Big Data purely in terms of volume understates the economic potential of data for a company and limits its understanding of the digital transition that is underway.
You must understand that Big Data is not primarily a phenomenon of data volume; it is a social phenomenon. It is the visible part of the world’s transition from the industrial age to the digital age. Please read this sentence again: Big Data is the visible part of the transition from the industrial economy to the digital economy. According to some historians, this transition began in 1989 with the fall of the Berlin Wall.
It comes mainly from the combination of two factors:
- the availability of the Internet to the general public
- the increase in the number of people connected to the Internet
Indeed, the popularization of the Internet led to the digitization of business activities, at the same time as the number of people connected to the Internet through smartphones and other devices increased.
Today, users are no longer connected to the Internet only through smartphones; they are also connected through their vehicles (connected vehicles), their homes (connected homes), and more. All these digital activities generate data.
Combine all these activities with the number of users (or objects) connected to the Internet, and you very quickly end up with an unprecedented amount of data.
Traditionally, data management technology centralized data storage and processing on a central database server in a client/server architecture. Unfortunately, the scale of Big Data growth today far exceeds the reasonable capacity of traditional technologies, and even of the typical hardware configurations that support access to this data.
The new technological approach is to distribute data storage and parallelize data processing across the nodes (machines) of a cluster. Hadoop is today the most mature software implementation of this approach.
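To make the idea of distributing storage and parallelizing processing concrete, here is a minimal sketch. It is a single-machine illustration in Python, not Hadoop code: worker threads merely stand in for cluster nodes, and the function names (`partition`, `count_words`, `merge_counts`, `distributed_word_count`) and the sample log lines are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_nodes):
    """Spread the records over num_nodes partitions, round-robin."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def count_words(records):
    """Map step: sees one partition only and returns partial counts."""
    counts = Counter()
    for line in records:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Reduce step: combine the partial results from every partition."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_word_count(records, num_nodes=3):
    partitions = partition(records, num_nodes)
    # Each worker thread stands in for one node of the cluster (assumption).
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(count_words, partitions))
    return merge_counts(partials)


logs = [
    "user login",
    "user purchase",
    "sensor reading",
    "user logout",
    "sensor reading",
]
totals = distributed_word_count(logs)
print(totals["user"], totals["sensor"])
```

The point of the design is that no single worker ever needs the whole dataset: each one processes only its own partition, and only the small partial results are brought together at the end.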
Hadoop will become the standard platform for data processing, much as Excel has been since the 1990s. What do we base such a claim on? Simple!
In the industrial economy, the opportunity was related to the size of the market. Demand was relatively stable. All that was needed to identify an opportunity was to find a need that was still unsatisfied and to estimate whether the size of the market was sufficient to cover the costs to be incurred. In the digital economy, this is not necessarily the case.
Technology plays a very important role. It profoundly changes consumer behavior and continually redefines demand. For example, when the car was first introduced to the market, it was considered a luxury. Once mass production techniques made it possible to produce it on a large scale, it quickly became a commodity that profoundly changed our conception of mobility, and it is now perceived as a necessity. Likewise, not so long ago phones were not part of our lives; today, with technological evolution, they have become indispensable to modern life.
By introducing new products that become the basis of a new lifestyle, technology creates needs that did not exist before, or at least were not perceived as such. Thus, in the digital age, opportunity lies in technology rather than in demand, as it did in the industrial era.
Technology is continually redefining what constitutes a need, which is the main determinant of demand. So, if you want to seize opportunities in the digital economy, you must anticipate the technologies that are likely to influence the level of demand. After all, has the history of humanity not always been divided into periods according to its technological level?
Traditionally, authors such as Nicholas Carr have argued that what makes a technology a competitive advantage is not its ubiquity, nor even its level of performance, but its scarcity and the difficulty of duplicating it. However, technological change reduces the cost of acquiring technology (see Gordon Moore’s law), which has the effect of trivializing or “commoditising” it, thus destroying the competitive advantage it could have provided. Even the latest technologies quickly become accessible.
It is through this force of technological evolution that vehicles, telephones, photocopiers, computers and even aeronautical products quickly end up becoming necessities and consumer products. The problem is this: with so many technologies developed each year, how do you identify the one that constitutes an opportunity and is likely to change consumer behavior?
In the digital age, an opportunity is detected by looking for sectors of the economy in which the technologies or practices in use are less effective than the latest technological developments available to that sector. In other words, an opportunity is a technological advance that is ready to become a standard in society.
Some economists describe this type of technology as RIT (Ready to be Implemented Technology). An RIT is the best method that exists in a given area: a technology or practice more effective than the one in use on the market, but which, for one reason or another, has not yet been adopted. For example, electric cars are better than petrol-fueled cars in terms of environmental impact, but they have not been adopted on a large scale because of, among other things, the lack of charging stations.
In this case, how do you recognize that a technology or a practice is ready to become a standard? A technology is ready to become a standard if it is transparent to the user. In other words, a technology becomes a standard from the moment it requires no more skill from the user than the technology it replaces. It is this principle of transparency to the user that explains Metcalfe’s famous law that “the value of a technology is proportional to the square of the number of people who use it”. This is also why the success of a technology does not depend on developers or specialized users, but on business users.
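The arithmetic behind Metcalfe’s law is easy to check with a small sketch. The law is usually stated for networks, and the function name `metcalfe_value` and the scale constant `k` below are assumptions introduced only for illustration; the point is simply that doubling the number of users multiplies the relative value by four.

```python
def metcalfe_value(users, k=1.0):
    """Relative value under Metcalfe's law: k * users**2.

    k is an arbitrary scale constant (assumption); only ratios are meaningful.
    """
    return k * users ** 2


# Each step doubles the number of users.
for users in (10, 20, 40):
    print(users, metcalfe_value(users))
```

This is why adoption by the large population of business users, rather than by a small group of specialists, weighs so heavily on the value of a technology.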
In terms of data management, SQL is today a very convenient language and a skill possessed by any business analyst worthy of the name. In addition, most business systems (Business Objects, Oracle, SAS, Tableau, SAP, Genesys Info Mart, etc.) rely on SQL. Thus a data management technology, however powerful, will never become a standard if it is not fully integrated with SQL.
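To give a sense of the skill in question, here is a minimal sketch of the kind of aggregate query a business analyst writes every day. It runs against Python’s built-in `sqlite3` module purely as a stand-in, and the `sales` table, its columns and its rows are hypothetical. SQL-on-Hadoop engines such as Apache Hive accept queries of this general shape, although dialects differ in their details.

```python
import sqlite3

# In-memory database used only as a stand-in for a real data platform.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0), ("south", 45.5)],
)

# Total sales per region, largest first.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for region, total in rows:
    print(region, total)

conn.close()
```

The analyst states what result is wanted and leaves the execution strategy to the engine, which is what transparency to the user means in practice.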
In addition to being mature and stable, Hadoop is one of the few Big Data technology platforms fully integrated with SQL, and in a few years it will require no more skill than SQL to use. This is why we can say with confidence that Hadoop will not be abandoned in the future; on the contrary, its standardization is just beginning. Spark has understood this concept of transparency to the user, but it is not yet mature enough to replace Hadoop. So do not be afraid or unsettled: the opportunities of Hadoop and Big Data are indeed real, despite the media hype.