Beginner’s Guide To Big Data


Big Data is a popular term for a large volume of data with great variety (many different kinds of data) and velocity (fast-changing data). These properties are known as the three Vs (volume, velocity, variety), following the definition by Gartner.

Because of its volume, velocity, and variety, Big Data is very hard, if not impossible, to manage with traditional data processing, so different approaches and tools are necessary.

With so much data available, and with markets as competitive as they are today, effective Big Data management has become a very important factor in staying ahead of competitors.

It’s important, however, to understand that big data management is not solely about the amount of data. How a business manages, analyzes, and uses the available data is far more important.

More About The 3V’s of Big Data

The three Vs trace back to analyst Doug Laney, who used the concept in the early 2000s (at META Group, which Gartner later acquired) to describe the behavior of Big Data. As Big Data management includes the creation, storage, and retrieval of data, each of the three Vs plays its part:

1. Data Volume

The ‘Big’ in Big Data already suggests a huge amount (volume) of data, which can include, but is not limited to, social media data, transactional data, investment information, and data related to business deals.

It’s important, however, to note that volume alone doesn’t determine whether we can categorize a dataset as Big Data, as it also has to possess high velocity and variety.

Volume is, however, an important parameter to consider, as it ultimately determines how, and with what tools, we can extract value from the data.

2. Velocity

The second V, velocity, refers to how fast the data is generated and processed. High velocity is a strong indicator of the data’s usability, or at least its potential usability.

In a nutshell, big data velocity is how quickly new data flows in, and from what sources. The biggest change in velocity behind the rise of big data is the increasing speed of the internet, and especially the amount of data circulating on social media.

Hundreds of thousands of hours of video are uploaded to YouTube every single day, and hundreds of millions of Twitter posts are published daily. Business transactions and investments are done online, faster than ever.

How well we manage and process this huge amount of data arriving at high speed is crucial in determining how usable the data actually is.

3. Variety

Today, we have a far greater variety of data than the plain old structured data we used to get in Excel sheets and databases. Other forms of data, from audio and video to financial transactions to unstructured numeric data, are now the norm, and each type requires a different approach, from preprocessing to accurately interpreting the data, to extract its actual value in context.

Different types of data also mean a variety of ways to collect, store, and analyze each type. Processing needs to be faster while still maintaining accuracy in extraction.

The concept of the three Vs has been expanded in recent years to include more Vs, such as veracity and value. However, the three above remain the most important and relevant.

The Different Types of Big Data

While there can be many varieties of data, as we have briefly discussed above, we can divide big data into just three major categories:

Structured Big Data

Structured data is any data that is stored and retrieved in a fixed, organized format, mainly in databases.

The key factor is that the format of the data is usually known, so we already know how to extract value from it.

For example, a customer database stores each customer in an organized record, along with other important variables such as spending and key behaviors.
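As a minimal sketch of why a known format makes value easy to extract, the example below builds a tiny customer table with Python's built-in sqlite3 module and summarizes it with a single query. The table name, column names, and values are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical customer table: the schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, segment TEXT, spending REAL)"
)
conn.executemany(
    "INSERT INTO customers (name, segment, spending) VALUES (?, ?, ?)",
    [
        ("Ana", "retail", 120.0),
        ("Ben", "retail", 80.0),
        ("Cho", "wholesale", 900.0),
    ],
)

# Because the structure is known in advance, one query answers the question
# "what is the average spending per customer segment?"
avg_spending = conn.execute(
    "SELECT segment, AVG(spending) FROM customers "
    "GROUP BY segment ORDER BY segment"
).fetchall()
conn.close()

for segment, average in avg_spending:
    print(segment, average)
```

Real structured Big Data lives in far larger, often distributed databases, but the principle is the same: the fixed schema tells us where every variable is.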

Unstructured Big Data

Unstructured data refers to any form of data without a specific structure or format. Due to the large volume, deriving value from unstructured big data can be really difficult and might need a dedicated tool or even dedicated hardware.

Common examples of big unstructured data are social media streams, emails, and Google search results.

Semi-structured Big Data

Semi-structured data has characteristics of both forms of data above. In most cases, semi-structured data is defined in a known format but is not completely structured. The data contains important information that differentiates individual elements, but it is not yet properly organized in a database.

Any data stored in an XML file can be considered semi-structured.
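To make the idea concrete, here is a small sketch using Python's standard xml.etree.ElementTree module. The XML document, its tag names, and the to_rows helper are hypothetical. Each record is tagged, so individual elements can be told apart, but the records do not all carry the same fields, which is what keeps the data from being fully structured.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: both records are tagged, but their fields differ.
XML_DOC = """
<customers>
  <customer id="1"><name>Ana</name><email>ana@example.com</email></customer>
  <customer id="2"><name>Ben</name><phone>555-0100</phone></customer>
</customers>
"""


def to_rows(xml_text, fields):
    """Flatten <customer> elements into uniform rows (one dict per customer)."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for customer in root.findall("customer"):
        row = {"id": customer.get("id")}
        for field in fields:
            # findtext returns None when the tag is missing from this record.
            row[field] = customer.findtext(field)
        rows.append(row)
    return rows


xml_rows = to_rows(XML_DOC, ["name", "email", "phone"])
for row in xml_rows:
    print(row)
```

Filling the missing fields with None is one simple way to turn semi-structured records into rows that a structured database could store.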

Why Do We Need Big Data Management?

Why has Big Data been the buzzword of the business world for the past half decade or so? Here are a few important reasons:

  • Better insights on customer behaviors

Customers now spend much of their time online, especially on social media. Harnessing and analyzing data from social media platforms, Google search results, YouTube viewing data, and other online sources can give us better insights into customer behaviors, and allow us to develop a better marketing strategy to cater to these customers.

By better understanding our customers’ behaviors, needs, problems, and pain points, we can also develop better products and services.  

  • Predictive analytics

One of the key advantages of managing and deriving value from big data is predictive analytics.

Big data analytics can suggest actions and predict outcomes more accurately, allowing businesses, managers, and executives to make better decisions. This also enables better risk management, more efficient operations, and better cost-efficiency.

  • Accurate derivation of value

Big data analytics essentially combines a variety of relevant data from various sources and derives accurate value from them. This allows companies to filter out irrelevant data and extract only what matters, as well as to find correlations between different data sets that lead to new insights.

Extracting useful data from a pool of irrelevant data can be extremely time-consuming and expensive, and proper Big Data management can help mitigate this.

  • Better insights for lead generation

Lead generation is often the biggest challenge for marketers and businesses. Big Data analytics can provide useful insights and uncover new lead sources, making lead generation more effective.

Big Data analytics tools can help businesses understand customers’ behaviors and how to capture their attention, as well as how our products, services, and current marketing efforts are performing. Big Data can help us better allocate our budget to the lead generation channels that matter.

  • Competitive advantage

The insights gained through Big Data analytics and management give us a chance to build more efficient operations and more effective marketing efforts, and so stay ahead of our competitors.

Big Data will also allow us to better understand our competitors and their approaches by collecting and analyzing various data streams from different sources.

How Does Big Data Work?

The main principle of Big Data management is that the more knowledge we have about a subject, the more reliably we can uncover new insights about it and predict its future.

Big Data analysis collects, analyzes, and compares data from different sources to find key correlations and derive value. This is commonly done by building and running a simulation model based on the different data variables. The data values are adjusted and readjusted with each simulation, and all the results are monitored.

With today’s big data analytics tools, we can run millions of these simulations automatically and tweak all the possible data points, with the goal of finding patterns and relationships between variables.
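The simulate-and-compare loop described above can be sketched at toy scale in a few lines of Python. Everything in this example is an assumption made for illustration: the simulate_sales model, its two input variables, and the choice of the Pearson correlation coefficient as the way to measure a relationship between variables.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_sales(ad_spend, shelf_position, rng):
    """Hypothetical toy model: sales depend on ad spend plus random noise.

    shelf_position is deliberately ignored, so it should show no pattern.
    """
    return 50 + 3 * ad_spend + rng.gauss(0, 5)


rng = random.Random(42)  # fixed seed so the runs are repeatable
runs = []
for _ in range(10_000):
    # Tweak the input variables for each simulation run and record the result.
    ad_spend = rng.uniform(0, 100)
    shelf_position = rng.uniform(0, 100)
    runs.append((ad_spend, shelf_position, simulate_sales(ad_spend, shelf_position, rng)))

ad_values, shelf_values, sales_values = zip(*runs)
corr_ad = pearson(ad_values, sales_values)
corr_shelf = pearson(shelf_values, sales_values)
print(f"ad spend vs sales:       {corr_ad:.3f}")
print(f"shelf position vs sales: {corr_shelf:.3f}")
```

A correlation close to 1 flags ad spend as a variable worth acting on, while a correlation close to 0 suggests shelf position can be ignored. Real Big Data tools apply the same idea across far more variables and runs.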

As mentioned above, data, especially from modern online sources, is usually unstructured: it is difficult to define the different data variables (especially given their volume), and the data does not come in a structured database. For example, images and videos from Facebook, together with recorded phone calls, are unstructured.

To tackle this issue, an advanced analytics tool, commonly built on machine learning (a branch of artificial intelligence), is required for three things:

  • Identify the data’s format, for example through image recognition, speech recognition, or natural language processing
  • Perform calculations and analysis quickly and in real time, and run millions of simulations accurately
  • Find patterns by comparing results, and extract value from the group of data

As these analytics tools require advanced technology and substantial processing power, a common practice nowadays is to rent the software and server space from third-party SaaS providers.

The availability of these SaaS Big Data analytics services has paved the way for more Big Data transformations in many businesses, as now the technology is more accessible and affordable for even smaller companies.

Common Big Data Applications

With all the benefits of Big Data management and applications we have discussed above, more businesses and industries are looking for ways to incorporate Big Data in their operations.

Here are some examples of common Big Data implementations:

  • Health Industry

Big Data implementations in the health and medical industry allow us to analyze and compare the different variables included in medical records. This gives us more insight for recognizing symptoms earlier and for developing new medicines.

  • Geology and Climate

Various sensors can collect data, and Big Data analysis can help forecast when and where natural disasters such as earthquakes and tornadoes are more likely to strike. Big Data analytics can also analyze patterns of human behavior, giving insights for better relief aid.

  • Security

Police and military organizations are using Big Data to analyze more intelligence and public data (e.g. automatic analysis of thousands of security footage files) to develop better strategies for preventing and fighting crime (and wars). This allows more efficient deployment of resources and faster action.

  • Space Exploration

Outer space yields vast numbers of unstructured data points, and Big Data implementations have allowed us to run millions of simulations to plan future missions.

  • Agriculture

Agriculture is a field where just a minor change in one variable can significantly affect yield, and the cost and time required for real-world experiments are often prohibitive. Big Data therefore provides a better, faster, and more cost-efficient way to simulate various conditions and maximize yield while keeping the pollution caused by agricultural machines as low as possible.

  • Lifestyle

Big Data is the technology behind on-demand services like Uber and Airbnb, as well as better experiences in ecommerce shopping, ticket booking, and so on.

Concerns Surrounding Big Data

While Big Data has provided us with new opportunities and benefits, the concept and technology are not yet perfect, raising questions that we must answer such as:


Privacy

Big Data allows companies and organizations to access our personal information, especially our conversations on social media. This can cause serious issues, and several high-profile cases have occurred in recent years, such as the Cambridge Analytica scandal.


Security

What will happen when an organization with access to a huge volume of important information is breached? With the amount of data growing, and data transfer speeds increasing with the arrival of 5G, the demand for advanced cybersecurity will be even greater than it is today.

Data Management Uncertainty

Various new and innovative data management tools and technologies are being implemented, with the aim to provide better efficiency and maximize results. However, these new approaches also create uncertainty and non-uniformity in data management.

Talent Gap

The usefulness of Big Data depends largely on the capabilities of the data analysts and data experts who work with it, and such talent is in short supply. Unless we can rely completely on AI to analyze data in the near future, this gap will remain an issue.

Final Words

Big Data analysis and management certainly have their value, and will continue to develop in the future to provide even more benefits from cost efficiency to better decision making to saving time. 

In the near future, Big Data will no longer be a luxury that only big companies possess as a competitive advantage; it will be a necessity. Understanding the concept of Big Data, and finding ways to start implementing it in your business today, will be crucial in achieving growth.
