Unstructured data is data that does not reside in predefined fields within a record. Examples include emails, videos, photos, web pages, and many forms of business documents. Unstructured data is distinguished from structured data, which has a predefined data model. In practice, a great deal of unstructured data is paired with structured data that gives it some organization. These pairings can include metadata, filing systems, or structured data fields that surround unstructured text, such as the sender, recipient, and date headers that accompany the free-text body of an email.
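As a rough illustration of such a pairing, the sketch below uses Python's standard `email` module to split a message into its structured header fields and its unstructured body. The message text, addresses, and the `structured` / `unstructured` variable names are made up for this example and are not part of any particular product.

```python
from email import message_from_string

# A hypothetical raw email: header fields followed by a free-text body.
raw = """\
From: alice@example.com
To: bob@example.com
Subject: Quarterly report
Date: Mon, 01 Jan 2024 09:00:00 +0000

Hi Bob, the report is attached. Let me know what you think of section 3.
"""

msg = message_from_string(raw)

# Structured part: named fields with predictable meaning.
structured = {
    "from": msg["From"],
    "to": msg["To"],
    "subject": msg["Subject"],
    "date": msg["Date"],
}

# Unstructured part: free text with no predefined data model.
unstructured = msg.get_payload().strip()

print(structured)
print(unstructured)
```

The header fields can be queried or filtered directly, while the body would need further processing (for example, natural language processing) before its contents could be treated the same way.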
Diffbot’s web extraction products pull unstructured data from web pages and transform it into structured semantic entities. The use of natural language processing, machine vision, and machine learning to optimize knowledge graph ontologies allows Diffbot to turn the unstructured web into a structured semantic data source.
See also structured data, relation extraction, DIKW Pyramid.