The DIKW Pyramid — also commonly known as the information or data pyramid — is a series of related models detailing the relationship between data, information, knowledge, and wisdom. The central observation shared through these models is that each level of the hierarchy is achieved by properly utilizing elements from the next lower level of the pyramid. In short, information is commonly talked about in terms of data, knowledge is commonly talked about in terms of information, and wisdom is achieved through properly utilized knowledge.
A second aspect of the DIKW pyramid involves the characterization of its constituent parts. Data is typified as simple signals or signs. On its own, data does not mean anything. An example is a single letter or an individual binary value. Information is inferred from data and is differentiated in that it is “useful.” Descriptions of the world around us that utilize data are information. Knowledge, while harder to pin down objectively, is commonly thought of as properly utilized information, or as a mixture of experience, judgement, successful processes, and information. Wisdom is harder still to pin down and is often ignored in information science applications of the DIKW pyramid.
Diffbot’s offerings differ from many data extraction platforms in that its knowledge graphs synthesize data and information from many sources. Contextually linked entities within the Knowledge Graph™ are pushed through a knowledge fusion process that weighs competing claims for validity from sources across the public web. This is a primary difference between data extraction providers and knowledge-as-a-service providers.
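The idea of weighing competing claims can be illustrated with a toy sketch. This is not Diffbot’s actual fusion process, which is not described here; it is a simple trust-weighted vote. The function name `fuse_claims`, the source names, and the trust scores are all hypothetical.

```python
from collections import defaultdict

def fuse_claims(claims, source_trust, default_trust=0.1):
    """Toy trust-weighted vote: return the value with the highest
    summed trust and its share of the total trust (assumed scheme)."""
    scores = defaultdict(float)
    for source, value in claims:
        # Unknown sources get a small default weight (an assumption).
        scores[value] += source_trust.get(source, default_trust)
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

# Hypothetical competing claims about one fact: an organization's headquarters.
claims = [
    ("site-a.example", "Boston"),
    ("site-b.example", "Boston"),
    ("site-c.example", "Austin"),
]
# Hypothetical per-source trust scores.
trust = {"site-a.example": 0.9, "site-b.example": 0.6, "site-c.example": 0.8}

value, confidence = fuse_claims(claims, trust)
print(value, round(confidence, 3))
```

In this sketch, two moderately trusted sources that agree outweigh one source that disagrees. A real system would likely also account for factors such as source freshness and whether sources are independent of one another.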
One of the primary differences between Diffbot’s web data offerings and those of competitors is that data is not simply “extracted.” Rather, data provided by Diffbot is parsed into entities that represent people, products, organizations, articles, and more. This process, aided by cutting-edge machine learning, natural language processing, and machine vision, represents a transition from web data provider services to knowledge-as-a-service. With the latter, previously unstructured data is not only structured but arranged by “experienced” machine learning to provide information tailored to many highly honed use cases.
To illustrate where Diffbot’s web data-sourced offerings fall within the DIKW pyramid, let’s look at examples of different web “data” providers and see where their output places them within the pyramid.
Web data providers provide:
- Raw, unprocessed symbols or materials: for example, all PDF files on a site, text fields from every page of a website, or the number of links to a given domain.
Information providers (many of whom bill themselves as “web data providers” instead) provide:
- Data structured into entries. Perhaps a pricing page placed into a table with columns for product names, ratings, and prices.
Knowledge-as-a-service providers provide:
- Previously unstructured information organized in a manner that provides unusual insight or comprehensiveness. For example, an extracted pricing page placed within a product entity, which is in turn placed as a constituent of an organization or brand page. Semantic output that can be filtered, searched, and faceted across entity types.
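The three tiers above can be sketched with a small example. This is a minimal illustration under stated assumptions, not a depiction of any provider’s real output: the sample values, the `to_rows` and `to_entities` helpers, and the `Product` and `Organization` classes are all hypothetical.

```python
from dataclasses import dataclass, field

# Data: raw, unprocessed text fields with no structure (hypothetical sample).
raw_fields = ["Acme Kettle", "4.5", "$29.99", "Acme Toaster", "4.1", "$39.50"]

def to_rows(fields):
    """Information: structure the raw fields into table-like entries."""
    rows = []
    for i in range(0, len(fields), 3):
        name, rating, price = fields[i:i + 3]
        rows.append({
            "name": name,
            "rating": float(rating),
            "price": float(price.lstrip("$")),
        })
    return rows

@dataclass
class Product:
    name: str
    rating: float
    price: float

@dataclass
class Organization:
    name: str
    products: list = field(default_factory=list)

def to_entities(rows, org_name):
    """Knowledge: place product entities as constituents of an organization."""
    org = Organization(name=org_name)
    for row in rows:
        org.products.append(Product(**row))
    return org

rows = to_rows(raw_fields)
acme = to_entities(rows, "Acme")

# Linked entities can be filtered across types, e.g. one brand's products under $35.
cheap = [p.name for p in acme.products if p.price < 35]
print(cheap)
```

The same six strings appear at every tier. What changes is how much structure and context surrounds them, which is the distinction the pyramid draws.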
See also data provenance, explainability and transparency in AI, Knowledge Graph Entity Types.