Keywords: What is BIG Data, Hadoop, Spark. Difference between Structured and Unstructured data. Difference between Hadoop and BIG Data, Examples of BIG Data, Structured, Unstructured data.

What is BIG Data? – PART I

Before learning about BIG Data, we should understand STRUCTURED DATA and UNSTRUCTURED DATA.

What is Structured Data?

Any data that is organized in COLUMNS and ROWS is STRUCTURED DATA. Irrespective of the technology in which the data is physically stored, as long as it is in columns and rows we refer to it as Structured Data. Structured data can be physically stored in RDBMS databases, Excel files, text files, in-memory databases, cloud servers or any other technology.

Examples:

All transactional data is structured data. In general, transactional data is created by systems such as ERP, CRM, Online Transactional Systems, e-Commerce and others.
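To make the "columns and rows, irrespective of technology" point concrete, here is a minimal sketch in Python. The order records, the column names and the helper functions `rows_from_csv` and `rows_from_sqlite` are all invented for illustration; they are not taken from any particular system. The sketch reads the same rows from a text file format (CSV) and from an in-memory database (SQLite) and checks that they match.

```python
import csv
import io
import sqlite3

# Hypothetical order records, invented for illustration only.
CSV_TEXT = """order_id,customer,amount
1001,Asha,250.00
1002,Ravi,99.50
"""

def rows_from_csv(text):
    """Read CSV text into a list of (order_id, customer, amount) rows."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(r["order_id"]), r["customer"], float(r["amount"])) for r in reader]

def rows_from_sqlite(rows):
    """Store the rows in an in-memory SQLite table and read them back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
    ).fetchall()
    conn.close()
    return result

csv_rows = rows_from_csv(CSV_TEXT)
db_rows = rows_from_sqlite(csv_rows)

# Same columns and rows, whether they live in a text file or a database.
print(csv_rows == db_rows)
```

The storage changes, but the shape of the data (three columns, one row per order) does not, which is what makes it structured.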

What is Unstructured Data?

Any Data that is not organized in the form of Columns and Rows is UNSTRUCTURED DATA.

Examples:

Word documents, PDF documents, email content, images, audio files, video files and others.
Data created on social networking websites such as Facebook, LinkedIn, Twitter, WhatsApp and others.
Data created by user searches in search engines such as Google, Bing, Yahoo and others.

BIG Data?

All UNSTRUCTURED DATA is referred to as BIG Data, because, by commonly cited estimates, more than 80% of the world's data is Unstructured Data.

What are Hadoop and Spark?

Hadoop and Spark are frameworks used to prepare Structured Data from Unstructured Data.
Some people refer to BIG Data as Hadoop and to Hadoop as BIG Data. Please don't; I hope you can now understand the difference.
Let me add one more comment: we can process Unstructured Data (BIG Data) into Structured Data without Hadoop and Spark. We can write a simple structured program or object-oriented program to convert Unstructured Data (BIG Data) into Structured Data.
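As a rough illustration of that last point, here is a small Python sketch that turns one piece of unstructured data (the text of an email) into a single structured row, with no Hadoop or Spark involved. The email text, the chosen columns and the function `email_to_row` are assumptions made up for this example; a real conversion would depend on what the data actually looks like.

```python
import re

# Hypothetical raw email text, invented for illustration only.
RAW_EMAIL = """From: asha@example.com
To: support@example.com
Subject: Order 1001 delayed

Hello, my order 1001 has not arrived yet. Please check.
"""

# The columns we choose to extract (an assumption for this sketch).
COLUMNS = ("sender", "recipient", "subject", "order_id")

def email_to_row(text):
    """Extract a fixed set of columns from free-form email text."""
    def header(name):
        # Find a line such as "From: someone@example.com".
        match = re.search(rf"^{name}:\s*(.+)$", text, flags=re.MULTILINE)
        return match.group(1).strip() if match else None

    # Treat everything after the first blank line as the message body.
    _, _, body = text.partition("\n\n")
    order = re.search(r"\border\s+(\d+)", body, flags=re.IGNORECASE)
    return {
        "sender": header("From"),
        "recipient": header("To"),
        "subject": header("Subject"),
        "order_id": int(order.group(1)) if order else None,
    }

row = email_to_row(RAW_EMAIL)
print([row[c] for c in COLUMNS])
```

Run over many emails, the same function would produce one row per email with the same four columns, which is exactly the columns-and-rows shape described earlier. Frameworks such as Hadoop and Spark become useful when the volume is too large for one simple program on one machine.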
