As the provider of the world’s largest commercially available Knowledge Graph™, Diffbot is well positioned to share fundamental concepts about what knowledge graphs accomplish today and what they will likely facilitate in the future. We’re excited to offer this glossary of knowledge graph-related terms and concepts for the next generation of KG users.
Automated Data Cleaning
Automated Data Cleaning involves the application of machine learning to accomplish the data cleaning objectives of modifying or removing data…
Automated Knowledge Base
Automated Knowledge Bases are large repositories of knowledge structured as entities and the relationships between them that are compiled through…
Commonsense Knowledge Bases
Commonsense Knowledge Bases are knowledge bases (or knowledge graphs) organized around the representation of data that everyone is expected to…
Data Archive
Data Archives retain information over time, often preserving a view of a given moment within a database. As opposed to…
Data Enrichment
Data Enrichment is the common practice of merging external, authoritative data with first-party customer or exploratory data. First-party “raw” customer…
Data Mining Confidence
Confidence within data mining is typically utilized for association-rule learning. In the case of market-basket analysis, confidence describes the relationships…
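As a rough illustration of confidence in market-basket analysis, the sketch below uses the standard association-rule definition: the confidence of a rule A → B is the share of transactions containing A that also contain B. The function name `confidence` and the sample baskets are hypothetical and are not taken from the glossary entry.

```python
def confidence(transactions, antecedent, consequent):
    """Return the confidence of the rule antecedent -> consequent.

    Confidence = (transactions containing both item sets)
               / (transactions containing the antecedent).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    # Keep only the baskets that contain every antecedent item.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0  # The rule never fires, so report zero rather than divide by zero.
    # Of those baskets, count the ones that also contain every consequent item.
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Hypothetical shopping baskets, for illustration only.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_confidence = confidence(baskets, {"bread"}, {"butter"})
print(f"confidence(bread -> butter) = {rule_confidence:.2f}")
```

Here bread appears in three baskets and butter accompanies it in two of them, so the rule "bread → butter" holds in two thirds of the cases where it applies.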
Data Provenance
Data provenance (also referred to as “data lineage”) is metadata paired with records that details the origin, changes…
Data Science Lifecycle
The Data Science Lifecycle is an iterative approach to managing data science contributions within an organization. Seven steps are commonly…
DIKW Pyramid
The DIKW Pyramid — also commonly known as the information or data pyramid — is a series of related models…
Discussion Data
Discussion Data is the primary fuel of discussion analysis and an extremely valuable source for sentiment analysis for…
Entity Resolution
Entity Resolution refers to a portion of a knowledge graph build process in which information from separate records is reconciled…
Facet
A Facet is an aspect or type of entry within a knowledge graph entity. Examples of facets include the location…
Faceted Search
Faceted Search is a search that returns a count of the prevalence of one set of attributes within entities that…
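One way to picture the counting behavior described above is the minimal sketch below. It filters a small list of entities by a query and then tallies how often each value of a chosen facet occurs among the matches. The function `facet_counts`, the field names, and the sample entities are all hypothetical; this is not Diffbot's query syntax or API.

```python
from collections import Counter


def facet_counts(entities, query, facet):
    """Count how often each value of `facet` occurs among entities matching `query`."""
    # An entity matches when every field in the query has the requested value.
    matches = [
        entity for entity in entities
        if all(entity.get(field) == value for field, value in query.items())
    ]
    # Tally the facet values, skipping entities that lack the facet entirely.
    return Counter(entity[facet] for entity in matches if facet in entity)


# Hypothetical entities, for illustration only.
entities = [
    {"type": "Organization", "industry": "Software", "country": "US"},
    {"type": "Organization", "industry": "Software", "country": "DE"},
    {"type": "Organization", "industry": "Retail", "country": "US"},
    {"type": "Person", "country": "US"},
]

counts = facet_counts(entities, {"type": "Organization"}, "industry")
for value, count in counts.most_common():
    print(value, count)
```

The result is a breakdown of the matching organizations by industry rather than a list of the organizations themselves.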
Firmographic Data
Firmographic Data — also known as firm demographic data — is data related to the fundamental characteristics of organizations. Often…
Folksonomy
A Folksonomy is a categorization system in which end users categorize content or entities with the use of tags or…
Graph
A graph is a mathematical concept used as a non-linear data structure within computer science. Graphs are often depicted visually,…
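As a small illustration of a graph as a data structure, the sketch below stores an undirected graph as an adjacency list, one common representation among several. The helper `build_graph` and the node labels are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a mapping from each node to its set of neighbors."""
    adjacency = defaultdict(set)
    for a, b in edges:
        # Record the edge in both directions because the graph is undirected.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


# Hypothetical nodes and edges, for illustration only.
graph = build_graph([("A", "B"), ("A", "C"), ("C", "D")])
print(sorted(graph["A"]))
```

Each node maps to the set of nodes it connects to, so looking up a node's neighbors takes a single dictionary access.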
Head Entities
Head Entities are entities within a knowledge graph that are more regularly referenced, linked to, downloaded, and utilized than other…
Induction
Induction is a form of reasoning in which premises are viewed as supplying some satisfactory evidence as to the truth…
Inference Engines
Inference Engines are components of an artificial intelligence system that apply logical rules to a knowledge graph (or base)…
Knowledge Base
Knowledge Bases are large repositories of structured or unstructured data for use within an information system. The term was originally…
Knowledge Engineering
Knowledge Engineering is a subset of engineering methods and questions within artificial intelligence that seeks to create systems that emulate…
Knowledge Fusion
Knowledge Fusion is a crucial and differentiating step within Diffbot’s Knowledge Graph™ build pipeline. Occurring after the linking of records,…
Knowledge Graph Entity
Knowledge Graph Entities are people, places, or “things” as defined within a knowledge graph. Grammatically, entities tend to be nouns…
Knowledge Graph Reasoner
A Knowledge Graph Reasoner — also called an inference engine, rules engine, or semantic engine — is an AI-enabled system…
Linked Data
Linked Data is data that is encoded alongside its semantic meaning. Linked data has been championed by numerous organizations in…
Long Tail
Long Tail data or entities are those that are less commonly referenced within a knowledge graph (or any data set).…
Natural Language Processing
Natural Language Processing is a field of inquiry and processes concerned with the interaction of computers and human language (speech…
Noise
Noise is typically thought of as unexplained variability in data. Noise is in contrast to a signal, which is clearly…
Ontology
An Ontology is a set of concepts or categories within one subject matter or domain that show properties of entities…
Organizational Data
Organizational Data — also called firmographic or firm demographic data — is data related to the fundamental characteristics of organizations.…
Origin
An Origin is the location or source of data that is incorporated into a fact. Origins are important for automated…
Overmerging
Overmerging occurs when a knowledge graph entity has too many records such that these records make data for the entity…
Product Data
Product Data includes all readable, measurable, and structurable data about products. While there is no universally accepted schema for all…
Properties
Properties are attributes or characteristics of knowledge graph entities. Properties vary depending on entity type and as described in a…
Proxies
Proxies — also known as proxy servers — are intermediate servers that receive web requests and redirect them. Proxy servers…
Record Linking
Record Linking is an important aspect of any knowledge graph build process that involves linking records to entities. An example…
Relation Extraction
Relation Extraction is the process of identifying associations between elements in unstructured data. For example, recognizing that Diffbot.com is the…
Schema
A Schema is a set of rules for how entities, attributes, and relationships between entities can be arranged in a…
Seed URL
A Seed URL in web crawling is a URL from which a web crawler will begin to traverse a site.…
Semantic Integration
Semantic Integration is the process of integrating information from diverse sources into a single structure. For example, pulling video conferencing…
Semantic Search
Semantic Search provides results based on semantic meaning among searched entities. Semantic search is distinguished from lexical search, which returns…
Structured Data
Structured Data refers to any form of data that resides in established fields within a record. This is distinguished from…
Transparency and Explainability
Transparency and Explainability of AI systems are concepts used to rate and discuss how AI systems come to the conclusions…
Unstructured Data
Unstructured Data is data that does not reside in established fields within a record. This doesn’t mean that there’s not…
URI
A URI (Uniform Resource Identifier) is a string of characters that identifies a resource, such as a web page or an entity within a knowledge graph.
What Is A Knowledge Graph?
A knowledge graph is an extension of a graph data structure that allows data to be stored in interrelated (contextually…
What is Knowledge As A Service (KAAS)?
As opposed to data as a service or information as a service, knowledge as a service provides data and context…