There’s a lot of data out there. So much so that we now use the word ‘zettabytes’ to describe the amount. Last year, the world generated around 94 zettabytes or 94 billion terabytes of data. (Source: Finances Online)
But is all data the same? Understanding the different types of data can help us to manage it effectively and focus on the data that holds the best insights for innovation.
There are two main types of data: structured and unstructured. Did you know that only 20% of data is considered structured? The rest of it is stored as unstructured data. Knowing the difference is an important part of using data effectively to produce insights.
What is Structured Data?
Structured data is data that follows a predefined format. This definition is what makes it consistent when read by a computer. Think of a nice-looking spreadsheet with names, addresses, and phone numbers of family members. Each row would represent a person and each column would hold exactly the information you would expect based on the column header.
In the survey world, structured data looks like a pick-one question with predefined answers to choose from. For example, picking from a list when asked what state you live in.
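To make the spreadsheet idea concrete, here is a minimal Python sketch. The survey export, its column names (respondent_id, state), and the helper count_by_column are all made up for illustration. Because every row follows the same predefined columns, tallying answers takes only a few lines.

```python
import csv
import io
from collections import Counter

# Hypothetical survey export: every row has the same predefined columns.
SURVEY_CSV = """respondent_id,state
1,Ohio
2,Texas
3,Ohio
4,Utah
"""


def count_by_column(csv_text: str, column: str) -> Counter:
    """Tally how often each value appears in one named column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


state_counts = count_by_column(SURVEY_CSV, "state")
print(state_counts.most_common())
```

The same approach would not work on free-text answers, since there is no fixed column of predefined values to count.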
While structured data makes things easier to aggregate, change, and analyze, in some scenarios it can pose limitations for evolving businesses.
By only considering structured data, you’re only getting the what. The best innovation comes from knowing your consumers so well that you’re able to tell a story and frame problems with solutions.
What is Unstructured Data?
As the name suggests, unstructured data is any form of data that is stored in its native form instead of standardized spreadsheets or predefined models. It’s raw.
Unstructured data often contains more text than figures or numbers, making it difficult to standardize across multiple sources.
Back to our example survey: an open-ended question might ask people to describe their recent experience buying a product online. Other examples of unstructured data include content from social media posts, emails, chat records, and web content. Unstructured data goes beyond text and includes other formats such as video recordings, audio recordings, photos, and more.
No wonder unstructured data makes up 80% of the world’s data.
Unstructured data can be useful for businesses and can help craft a great story that supports an informed business decision, but only if it is properly leveraged. Compared with structured data, it is much easier for people to take unstructured data out of context and for businesses to do a poor job managing it.
What Exactly is Semi-Structured Data?
To complicate things, a third type of data has emerged called semi-structured.
Semi-structured data combines elements of both structured and unstructured data. However, it often falls under the unstructured data umbrella. While some parts of semi-structured data are readable and easy to analyze, those elements alone do not provide the full value.
HTML code is a common example of semi-structured data. Its defined tags and elements organize it in a structured way, but the content inside those tags is often free-form, which makes it a challenge for a database to make sense of it.
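As a rough illustration, the Python sketch below parses a made-up review snippet with the standard library's html.parser module. The snippet, its class names, and the ReviewParser helper are assumptions for this example, not a real page or a recommended pipeline. The tags make it easy to pull out the rating as a number, while the comment inside the paragraph tag is still raw text that needs further analysis.

```python
from html.parser import HTMLParser

# Hypothetical product-review snippet, made up for illustration.
REVIEW_HTML = (
    '<div class="review">'
    '<span class="rating">4</span>'
    '<p>Checkout was quick, but delivery took longer than promised.</p>'
    '</div>'
)


class ReviewParser(HTMLParser):
    """Collect the text found directly inside each tag, keyed by tag name."""

    def __init__(self):
        super().__init__()
        self.open_tags = []    # stack of tags that are currently open
        self.text_by_tag = {}  # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if self.open_tags and text:
            self.text_by_tag.setdefault(self.open_tags[-1], []).append(text)


parser = ReviewParser()
parser.feed(REVIEW_HTML)
parser.close()

rating = int(parser.text_by_tag["span"][0])  # structured part: a number
comment = parser.text_by_tag["p"][0]         # unstructured part: free text
print(rating, comment)
```

The rating can go straight into a spreadsheet column, but the comment still has to be read or processed before it reveals why the customer gave that score.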
Why are Structured and Unstructured Data Types Important for Modern, Innovative Businesses?
Both data types are invaluable assets in understanding what customers are doing, what they think, feel, and prefer. Leveraging the what (structured data) and the why (unstructured data) in effective, ethical ways is critical for today’s insight seekers and data storytellers. What will separate the best from the rest will ultimately be those individuals and businesses that continue to find innovative solutions to the ever-evolving data problems.