Big Data, the new buzzword in today’s technology, is gaining importance because of its high rewards. A systematic and focused approach to adopting Big Data allows one to derive maximum value from it and use its full power.
It is a new framework or system for gaining insight into the different forms of existing data, increasing the power of researchers and analysts to get more out of existing systems.
As BG Univ says, “Big data is about the application of new tools to do MORE analytic on MORE data for More people.”
The lifecycle of data can be defined as:
People often confuse Big Data and Hadoop, treating them as the same thing. They are not: Big Data is not only Hadoop.
Big Data is not a tool or a single technique. It is a platform or framework with various components, such as data warehouses (providing OLAP and historical data), real-time data systems, and Hadoop (which provides insight into structured, semi-structured, and unstructured data).
Examples of Big Data include traffic data, flight data, and search engine data.
Thus Big Data involves a huge volume, a high velocity, and an extensive variety of data. The data falls into three types:
a) Structured data: Relational data.
b) Semi-structured data: XML data.
c) Unstructured data: Word, PDF, Text, Media Logs.
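To make the three types concrete, here is a minimal Python sketch that reads the same piece of information (a flight delay) from each kind of source. The flight records, field names, and sentence are hypothetical and only for illustration.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Structured: relational-style rows with a fixed schema (hypothetical flight table).
structured = io.StringIO("flight,origin,delay_min\nAI101,DEL,12\nBA202,LHR,0\n")
rows = list(csv.DictReader(structured))

# Semi-structured: XML carries its own tags, but fields may vary per record.
semi = ET.fromstring(
    "<flights><flight id='AI101'><delay>12</delay></flight></flights>"
)
xml_delays = {f.get("id"): int(f.findtext("delay")) for f in semi.iter("flight")}

# Unstructured: free text, so any structure has to be extracted from it.
text = "Flight AI101 left Delhi 12 minutes late because of fog."
match = re.search(r"(\d+) minutes late", text)
text_delay = int(match.group(1)) if match else None
```

The structured source can be queried by column name right away, the XML needs a parser that walks its tags, and the free text needs a pattern (or a more advanced technique) before any value can be used.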
Big Data can be characterized by the 3 V’s:
1) Velocity -> Batch processing, real-time data
2) Variety -> Structured, semi-structured, unstructured, and polymorphic data
3) Volume -> Terabytes to petabytes
Big Data strains existing traditional systems because, as data grows, its complexity, security concerns, maintenance effort, and processing time grow with it. This is where distributed processing comes into the picture: the work is spread across multiple systems or disks and processed in parallel.
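The idea of splitting work and processing it in parallel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads on one machine stand in for the separate systems or disks of a real cluster, and the function names (`split_into_chunks`, `count_words`, `parallel_word_count`) are hypothetical, not part of any Big Data product.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Divide the records into roughly equal parts, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_words(chunk):
    """Work done independently on each chunk (the 'map' step)."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, n_workers=4):
    """Process chunks concurrently, then merge the partial results (the 'reduce' step)."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

For example, `parallel_word_count(["search flights", "traffic data", "search traffic"])` counts the words in each chunk separately and then combines the partial counts. Real distributed systems follow the same split, process, and merge pattern, but across many machines and with handling for failures and data placement.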
Various tools and technologies are available in the market from different vendors, including IBM, Microsoft, and others, to handle Big Data. A few of them are: