What Is Unstructured Data?
Every piece of data that is not structured data can be classified as unstructured data. It’s estimated that by 2025, 80% of the data we encounter will be unstructured data in the form of text, audio, image, or video [1].
In short, unstructured data is modern data. It’s often:
- Born digital and unpredictable
- Always being created and on the move
- Blended, multimodal, and interoperable
- Geo-distributed for better protection
Unstructured data can have some associated metadata that can, in turn, have a structure. For example, a video can have metadata of video resolution, bit rate, frames per second (FPS), owner of the video, etc. But the video itself is unstructured. When there’s some structured metadata associated with unstructured data, it’s occasionally referred to as semi-structured data.
Looking more closely at the example of a YouTube video, some metadata is present, such as the time of upload, date of upload, number of views (partial or full), number of likes and dislikes, etc. But the content inside the video title, the video description, and the video itself is unstructured. It has a qualitative aspect that cannot be captured purely by numbers.
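To make the split concrete, here is a minimal Python sketch of a video record that keeps structured metadata apart from unstructured content. The class names, field names, and sample values are hypothetical, chosen only for illustration; they do not reflect any real platform's data model.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class VideoMetadata:
    """Structured part: fixed fields with known types (hypothetical names)."""
    uploaded_at: datetime
    views: int
    likes: int
    resolution: str
    fps: float


@dataclass
class VideoRecord:
    """Structured metadata paired with unstructured content."""
    metadata: VideoMetadata
    title: str        # free text, unstructured
    description: str  # free text, unstructured
    payload: bytes    # raw encoded video, unstructured


record = VideoRecord(
    metadata=VideoMetadata(
        uploaded_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
        views=1200,
        likes=85,
        resolution="1920x1080",
        fps=30.0,
    ),
    title="A walk through the old town",
    description="Filmed on a rainy morning; the light was lovely.",
    payload=b"\x00\x01\x02",  # placeholder bytes standing in for real video data
)

# Metadata can be filtered or aggregated like columns in a table.
is_popular = record.metadata.views > 1000

# The metadata has a fixed, enumerable set of fields; the content does not.
metadata_columns = sorted(asdict(record.metadata).keys())
```

The metadata fields can be compared, sorted, and counted directly, while the title, description, and payload would need further processing (for example, text or video analysis) before they yield anything comparable.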
The most commonly used type of database for unstructured data is NoSQL. NoSQL stands for “not only SQL,” indicating that these databases can handle a wider range of data than relational SQL databases. NoSQL is a category of databases rather than a single product, and NoSQL databases generally don’t enforce a fixed schema or tabular structure; instead, data is stored as collections of items such as documents or key-value pairs.
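The idea of a collection without a fixed schema can be sketched in plain Python. This is an illustration only: it uses an in-memory list of dictionaries rather than a real NoSQL database, and the `find` helper and all field names are hypothetical.

```python
import json

# A "collection" here is just a list of documents (plain dicts).
# No schema is declared, so each document can carry different fields.
collection = []

collection.append({"_id": 1, "type": "video", "title": "Launch recap", "fps": 30})
collection.append({"_id": 2, "type": "audio", "title": "Podcast ep. 4", "duration_s": 1820})
collection.append({"_id": 3, "type": "text", "body": "Meeting notes from Tuesday"})


def find(docs, **criteria):
    """Return the documents whose fields equal every key/value in criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Query by whichever fields happen to exist; missing fields simply don't match.
videos = find(collection, type="video")

# Each document has its own set of fields.
field_sets = [sorted(d.keys()) for d in collection]

# Documents of this kind serialize naturally to JSON.
serialized = json.dumps(collection)
```

A relational table would require all three records to share the same columns; here, each document simply carries the fields that make sense for it.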
Unstructured Data Storage with UFFO
Although unstructured data can provide significant insight with huge transformative potential, wrangling it is challenging. Pure’s unified fast file and object (UFFO) storage solution, Pure Storage® FlashBlade®, offers the speed of flash storage technology, as well as the ability to scale any architecture in an agile fashion. Want to take a closer look? Pure offers a free trial of FlashBlade so that you can test drive the solution with no commitment.